SQL random name generator not inserting first and last name in the same row

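I managed to create a simple query that randomly picks a first name and a last name and inserts them into a result table. I want something I can reuse across the various tests I run where I have to fabricate a lot of data. Here is the code (for simplicity, I have only included 5 first names and 5 last names):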

SELECT 
    FirstName, LastName
FROM
    (SELECT TOP 1 
         FirstName 
     FROM 
         (SELECT 'John' AS FirstName 
          UNION SELECT 'Tim' AS FirstName 
          UNION SELECT 'Laura' AS FirstName
          UNION SELECT 'Jeff' AS FirstName
          UNION SELECT 'Sara' AS FirstName) AS First_Names 
     ORDER BY NEWID()) n1
FULL OUTER JOIN 
    (SELECT TOP 1
         LastName 
     FROM (SELECT 'Johnson' AS LastName 
           UNION SELECT 'Hudson' AS LastName 
           UNION SELECT 'Jackson' AS LastName
           UNION SELECT 'Ranallo' AS LastName
           UNION SELECT 'Curry' AS LastName) AS Last_Names 
     ORDER BY NEWID()) n2 ON [n1].FirstName = [n2].LastName
WHERE 
    n1.FirstName IS NOT NULL OR n2.LastName IS NOT NULL

The result looks like this:

FirstName LastName 
NULL      Hudson
John      NULL

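I want the query to return a single row in which both the first name and the last name are randomly generated, so that every row has a complete name (no NULL values). I'm sure I'm overlooking something simple.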

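The problem is your join. A FULL OUTER JOIN on n1.FirstName = n2.LastName never finds a match, because none of the first names equals any of the last names, so each side comes back on its own row with NULL for the other. Since each subquery returns exactly one row, a CROSS JOIN pairs them into a single row: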

SELECT FirstName, LastName
FROM
(SELECT TOP 1 FirstName 
FROM (SELECT 'John' AS FirstName 
UNION SELECT 'Tim' AS FirstName 
UNION SELECT 'Laura' AS FirstName
UNION SELECT 'Jeff' AS FirstName
UNION SELECT 'Sara' AS FirstName) AS First_Names ORDER BY NEWID())n1
CROSS JOIN 
(SELECT TOP 1 LastName 
FROM (SELECT 'Johnson' AS LastName 
UNION SELECT 'Hudson' AS LastName 
UNION SELECT 'Jackson' AS LastName
UNION SELECT 'Ranallo' AS LastName
UNION SELECT 'Curry' AS LastName) AS Last_Names ORDER BY NEWID())n2
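The same idea can be written a little more compactly with a table value constructor instead of chained UNION SELECTs. This is only a sketch, assuming SQL Server 2008 or later; the alias `v` is just an illustrative name.

```sql
-- Pick one random first name and one random last name, then pair them.
-- Each derived table returns exactly one row, so CROSS JOIN yields one row.
SELECT n1.FirstName, n2.LastName
FROM (SELECT TOP (1) v.FirstName
      FROM (VALUES ('John'), ('Tim'), ('Laura'), ('Jeff'), ('Sara')) AS v(FirstName)
      ORDER BY NEWID()) AS n1
CROSS JOIN
     (SELECT TOP (1) v.LastName
      FROM (VALUES ('Johnson'), ('Hudson'), ('Jackson'), ('Ranallo'), ('Curry')) AS v(LastName)
      ORDER BY NEWID()) AS n2;
```

Note that, unlike UNION, a VALUES list does not remove duplicate entries, which only matters if the same name is listed twice.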

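The following code lets you generate a whole series of random names, whereas the cross join solution only gives you one name at a time. I'm not sure whether you need to do this multiple times, but if you do: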

create table #table (firstname varchar(50), lastname varchar(50))

declare @counter int = 1 
declare @max int = 5 --set number of repetitions here
declare @a varchar(50)
declare @b varchar(50)

while @counter <= @max
begin

SET @a = (SELECT TOP 1 FirstName 
    FROM (SELECT 'John' AS FirstName 
    UNION SELECT 'Tim' AS FirstName 
    UNION SELECT 'Laura' AS FirstName
    UNION SELECT 'Jeff' AS FirstName
    UNION SELECT 'Sara' AS FirstName) AS First_Names ORDER BY NEWID())

SET @b =
    (SELECT TOP 1 LastName 
    FROM (SELECT 'Johnson' AS LastName 
    UNION SELECT 'Hudson' AS LastName 
    UNION SELECT 'Jackson' AS LastName
    UNION SELECT 'Ranallo' AS LastName
    UNION SELECT 'Curry' AS LastName) AS Last_Names ORDER BY NEWID())

    insert into #table values (@a, @b)

    set @counter = @counter + 1
end

select * from #table
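One practical wrinkle when rerunning the loop script: `create table #table` fails if the temp table still exists from an earlier run in the same session. A minimal guard, assuming you are happy to discard the rows from the previous run, could go at the top of the script in place of the bare `create table`:

```sql
-- Drop the temp table left over from an earlier run in this session, if any.
IF OBJECT_ID('tempdb..#table') IS NOT NULL
    DROP TABLE #table;

-- Recreate it empty, with the same columns the loop inserts into.
CREATE TABLE #table (firstname varchar(50), lastname varchar(50));
```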

If you need more than one combination, here is an alternative to using a loop for this kind of thing.

declare @max int = 5;

with FirstNames(FName) as
(
    SELECT 'John' UNION ALL
    SELECT 'Tim' UNION ALL
    SELECT 'Laura' UNION ALL
    SELECT 'Jeff' UNION ALL
    SELECT 'Sara'
)
, LastNames(LName) as
(
    SELECT 'Johnson' UNION ALL
    SELECT 'Hudson' UNION ALL
    SELECT 'Jackson' UNION ALL
    SELECT 'Ranallo' UNION ALL
    SELECT 'Curry'
)
, SortedNames(FName, LName, RowNum) as
(
    select FName
        , LName
        , ROW_NUMBER() over (Order by newid())
    from FirstNames
    cross join LastNames
)

select FName
    , LName
from SortedNames
where RowNum <= @max
order by NEWID();

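If you want to generate many rows efficiently (and you did mention needing to fabricate a lot of sample data), here is something I posted earlier today on another question on S.O. That question dealt with randomizing 4 fields instead of two, but I'm leaving the extra fields in here so you can see how easy it is to adapt this to other scenarios you might have. Removing any of the fields is fairly simple, and it is also easy to add new values to any of the 4 table variables (to increase the number of possible combinations), because the query dynamically adjusts the randomization range to fit whatever data is in each table variable (i.e. rows 1 - n).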

DECLARE @TelNumber TABLE (TelNumberID INT NOT NULL IDENTITY(1, 1),
                          Num VARCHAR(30) NOT NULL);
INSERT INTO @TelNumber (Num) VALUES ('1525407'), ('5423986'), ('1245398'), ('32657891'),
                                    ('123658974'), ('7896534'), ('12354698');

DECLARE @FirstName TABLE (FirstNameID INT NOT NULL IDENTITY(1, 1),
                          Name NVARCHAR(30) NOT NULL);
INSERT INTO @FirstName (Name) VALUES ('Babak'), ('Carolin'), ('Martin'), ('Marie'),
                  ('Susane'), ('Michail'), ('Ramona'), ('Ulf'), ('Dirk'), ('Sebastian');

DECLARE @LastName TABLE (LastNameID INT NOT NULL IDENTITY(1, 1),
                         Name NVARCHAR(30) NOT NULL);
INSERT INTO @LastName (Name) VALUES ('Bastan'), ('Krause'), ('Rosner'),
                  ('Gartenmeister'), ('Rentsch'), ('Benn'), ('Kycik'), ('Leuoth'),
                  ('Kamkar'), ('Kolaee');

DECLARE @Address TABLE (AddressID INT NOT NULL IDENTITY(1, 1),
                        Addr NVARCHAR(100) NOT NULL);
INSERT INTO @Address (Addr) VALUES ('Deutschlan Chemnitz Sonnenstraße 59'), (''),
  ('Deutschland Chemnitz Arthur-Strobel straße 124'),
  ('Deutschland Chemnitz Brückenstraße 3'),
  ('Iran Shiraz Chamran Blvd, Niayesh straße Nr.155'), (''),
  ('Deutschland Berlin Charlotenburg Pudbulesky Alleee 52'),
  ('United State of America Washington DC. Farbod Alle'), ('');

DECLARE @RowsToInsert INT = 10000;

;WITH rowcounts AS
(
  SELECT (SELECT COUNT(*) FROM @TelNumber) AS [TelNumberRows],
         (SELECT COUNT(*) FROM @FirstName) AS [FirstNameRows],
         (SELECT COUNT(*) FROM @LastName) AS [LastNameRows],
         (SELECT COUNT(*) FROM @Address) AS [AddressRows]
), nums AS
(
  SELECT TOP (@RowsToInsert)
         (CRYPT_GEN_RANDOM(1) % rc.TelNumberRows) + 1 AS [RandomTelNumberID],
         (CRYPT_GEN_RANDOM(1) % rc.FirstNameRows) + 1 AS [RandomFirstNameID],
         (CRYPT_GEN_RANDOM(1) % rc.LastNameRows) + 1 AS [RandomLastNameID],
         (CRYPT_GEN_RANDOM(1) % rc.AddressRows) + 1 AS [RandomAddressID]
  FROM   rowcounts rc
  CROSS JOIN msdb.sys.all_columns sac1
  CROSS JOIN msdb.sys.all_columns sac2
)
-- INSERT dbo.Unsprstb(Firstname, Lastname, Tel, Address)
SELECT fn.Name, ln.Name, tn.Num, ad.Addr
FROM   @FirstName fn
FULL JOIN nums
        ON nums.RandomFirstNameID = fn.FirstNameID
FULL JOIN @LastName ln
        ON ln.LastNameID = nums.RandomLastNameID
FULL JOIN @TelNumber tn
        ON tn.TelNumberID = nums.RandomTelNumberID
FULL JOIN @Address ad
        ON ad.AddressID = nums.RandomAddressID;

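Notes:

  • The FULL JOIN, rather than an INNER JOIN, is needed to get the entire @RowsToInsert number of rows.
  • Duplicate rows are possible, both because of the nature of this randomization and because DISTINCT is not used to filter them out. However, DISTINCT cannot be used with the sample data given in that question, because the number of elements in each array / table variable provides only 6300 unique combinations, while the requested number of rows to generate is 10,000. If more values are added to the table variables so that the total number of possible unique combinations exceeds the requested number of rows, then either the DISTINCT keyword can be added to the nums CTE, or the query can be restructured to simply CROSS JOIN all of the table variables, include a ROW_NUMBER() field, and grab the TOP(n) using ORDER BY NEWID() (though that approach has its own pros and cons).
  • The INSERT is commented out so that it is easier to see that the query above produces the desired result. Just uncomment the INSERT to have the query perform the actual DML operation.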
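The CROSS JOIN restructuring mentioned in the notes might look roughly like the sketch below. It is cut down to two table variables with three values each to stay short, and `@RowsToReturn` is an illustrative name, not something from the original query. Because TOP with ORDER BY NEWID() already picks random rows, the sketch leaves out the row-number column.

```sql
DECLARE @FirstName TABLE (Name NVARCHAR(30) NOT NULL);
INSERT INTO @FirstName (Name) VALUES (N'Babak'), (N'Carolin'), (N'Martin');

DECLARE @LastName TABLE (Name NVARCHAR(30) NOT NULL);
INSERT INTO @LastName (Name) VALUES (N'Bastan'), (N'Krause'), (N'Rosner');

-- Only 3 * 3 = 9 combinations exist, so asking for more returns just 9 rows.
DECLARE @RowsToReturn INT = 5;

-- Build every combination once, shuffle, and keep the first @RowsToReturn.
-- Each combination can appear at most once as long as the source lists
-- contain no repeated values.
SELECT TOP (@RowsToReturn)
       fn.Name AS FirstName,
       ln.Name AS LastName
FROM   @FirstName fn
CROSS JOIN @LastName ln
ORDER BY NEWID();
```

The trade-off is that the full cross product is built and sorted before TOP is applied, so the cost grows with the product of the list sizes rather than with the number of rows requested.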