Extracting patterns of data in SQL

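I am trying to determine whether an item is sold or in stock by checking the orderstatus column for the patterns SLD and SOLD; if neither is present, the item is stock.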

In addition, if * or BAM appears in the orderstatus column, the BAMyesorno column should be BAM.

For the soldorstockdate field, if the orderstatus column contains a date (format mm/dd/yyyy), that date is used; otherwise the date from orddate is used.

orderstatus             | orddate  | comment | BAM-Yes or no | Soldorstockdate
*SLD 05/11/2022         | 5/1/2022 | Sold    | BAM           | 5/11/2022
*SOLD 05/15/2022        | 5/8/2022 | Sold    | BAM           | 5/15/2022
37141 SLD BAM           | 5/5/2022 | Sold    | BAM           | 5/5/2022
*STOCK 05/16/2022       | 5/3/2022 | Stock   | BAM           | 5/16/2022
1277489 STK#39298.32831 | 5/4/2022 | Stock   |               | 5/4/2022
36888 SLD FLOREN ANGEL  | 5/6/2022 | Sold    |               | 5/6/2022
11274848                | 5/5/2022 | Stock   |               | 5/5/2022

I tried the following:

SELECT 
    *,
    CASE 
        WHEN INSTR('%SLD%', `ORDERSTATUS`) > 0 
            THEN 'Sold'
        WHEN INSTR('%SOLD%', `ORDERSTATUS`) > 0 
             
        ELSE'Stock' 
    END AS comment,
    CASE 
        WHEN INSTR('%[0-9]/[0-9]%', `ORDERSTATUS`) > 0 
             OR LOCATE('*', `ORDERSTATUS`) > 0 
             OR LOCATE('BAM', `ORDERSTATUS`) > 0
            THEN 'BAM' 
            ELSE'' 
    END AS BAMYN,
    CASE 
        WHEN INSTR('%[0-9]/[0-9]%', `ORDERSTATUS`) > 0 
            THEN CAST(SUBSTRING(`ORDERSTATUS`, LOCATE('/', `ORDERSTATUS`) - 2, 5)  AS DATE)
            ELSE `ORD_DATE` 
    END AS soldorstockdate 
FROM 
    table

Input table

create table ##input
(segment varchar(20),
mmodel varchar(40),
brand varchar(30),
orderstatus varchar(100),
orddate date)

    insert into ##input values
    ('maka','M12E4','Nimg','*SLD 05/11/2022','5/1/2022'),
    ('sika','KL6781','Cheung','37141 SLD BAM','5/5/2022'),
    ('kloi','NB1290','Vloti','1277489 STK#39298.32831','5/4/2022'),
    ('Ping','BN1289','gower','36888 SLD FLOREN ANGEL','5/6/2022'),
    ('Melow','VB1901','operw','1286664 051222','5/10/2022'),
    ('Bekow','XC901','mewar','*SLD 5/14/22 Heman','5/3/2022'),
    ('Nakin','JH121','korew','STOCK','5/16/2022'),
    ('Verura','CV123','thilla','1287002 LONGMINT','5/12/2022'),
    ('Chaluli','BN8901','dora','STOCK BAM 5/17/22','5/11/2022'),
    ('Kroger','XC123','iops','*STOCK BAM 5/23/22','5/8/2022'),
    ('beqow','VB123','pirar','3902120 STOCK','5/20/2022'),
    ('Viast','NM41W','kolpe','SOLD BRANDON BOX 36790','5/15/2022'),
    ('Chimmin','BN123','tyrow','STK 5/13','5/3/2022'),
    ('Bellow','Vio23','Callow','*STK 5/13/22','5/5/2022'),
    ('Nalla','Krowmin','Gilqa','37938 STOCK 5/18/22 PER SARA','5/18/2022')

Output table

create table ##output
(segment varchar (20),
mmodel varchar(40),
brand varchar(30),
orderstatus varchar(100),
orddate date,
comment varchar(40),
BAMYN varchar(10),
soldorstockdate date)

insert into ##output values
('maka','M12E4','Nimg','*SLD 05/11/2022','5/1/2022','Sold','BAM','5/11/2022'),
('sika','KL6781','Cheung','37141 SLD BAM','5/5/2022','Sold','BAM','5/5/2022'),
('kloi','NB1290','Vloti','1277489 STK#39298.32831','5/4/2022','','',''),
('Ping','BN1289','gower','36888 SLD FLOREN ANGEL','5/6/2022','Sold','','5/6/2022'),
('Melow','VB1901','operw','1286664 051222','5/10/2022','','',''),
('Bekow','XC901','mewar','*SLD 5/14/22 Heman','5/3/2022','Sold','BAM','5/14/2022'),
('Nakin','JH121','korew','STOCK','5/16/2022','Stock','','5/16/2022'),
('Verura','CV123','thilla','1287002 LONGMINT','5/12/2022','','',''),
('Chaluli','BN8901','dora','STOCK BAM 5/17/22','5/11/2022','Stock','BAM','5/17/2022'),
('Kroger','XC123','iops','*STOCK BAM 5/23/22','5/8/2022','Stock','BAM','5/23/2022'),
('beqow','VB123','pirar','3902120 STOCK','5/20/2022','Stock','','5/20/2022'),
('Viast','NM41W','kolpe','SOLD BRANDON BOX 36790','5/15/2022','Sold','','5/15/2022'),
('Chimmin','BN123','tyrow','STK 5/13/2022','5/3/2022','Stock','BAM','5/13/2022'),
('Bellow','Vio23','Callow','*STK 5/13/22','5/5/2022','Stock','BAM','5/13/2022'),
('Nalla','Krowmin','Gilqa','37938 STOCK 5/18/22 PER SARA','5/18/2022','Stock','BAM','5/18/2022')

I can't seem to cover short dates that have no year (5/13 in the Chimmin row).

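Try this modified example SQL; I believe it should be very close to what you expect, although I am also confused by the empty-string dates ('') in your sample output.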

SELECT 
    *,
    CASE 
        WHEN REGEXP_LIKE(`ORDERSTATUS`, 'SLD|SOLD') > 0 
            THEN 'Sold'
        ELSE 'Stock' 
    END AS comment,
    CASE 
        WHEN REGEXP_INSTR(`ORDERSTATUS`, '[0-9]{1,2}/[0-9]{1,2}/[0-9]{2,4}') > 0 
             OR LOCATE('*', `ORDERSTATUS`) > 0 
             OR LOCATE('BAM', `ORDERSTATUS`) > 0
            THEN 'BAM' 
            ELSE '' 
    END AS BAMYN,
    CASE 
        WHEN REGEXP_LIKE(`ORDERSTATUS`, '[0-9]{1,2}/[0-9]{1,2}/[0-9]{2,4}') > 0 
            THEN DATE_FORMAT(
                     STR_TO_DATE(
                         REGEXP_SUBSTR(`ORDERSTATUS`, '[0-9]{1,2}/[0-9]{1,2}/[0-9]{2,4}'), 
                         IF(LENGTH(REGEXP_SUBSTR(`ORDERSTATUS`, '[0-9]{1,2}/[0-9]{1,2}/[0-9]{2,4}')) < 10, '%m/%d/%y', '%m/%d/%Y')
                     ),
                     '%Y-%m-%d'
                 )
            ELSE `ORDDATE` 
    END AS soldorstockdate
FROM 
    input

You can also try it out in this example dbfiddle.
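
The date pattern in the query above still requires a year, so a short date such as the 5/13 in the Chimmin row falls through to orddate. Here is a minimal sketch of one way to cover it. It assumes MySQL 8.0+ (for REGEXP_SUBSTR) and that a date written without a year belongs to the same year as orddate; that second assumption is mine, not something the question states. The alias dt is a hypothetical helper column.

```sql
-- Sketch only: assumes MySQL 8.0+ and the question's table, here named `input`.
-- `dt` is a hypothetical helper column holding the date text found in orderstatus
-- (NULL when no m/d or m/d/y pattern is present).
SELECT
    t.*,
    CASE
        -- No date in orderstatus: fall back to orddate.
        WHEN t.dt IS NULL
            THEN t.orddate
        -- Month/day only (e.g. '5/13'): ASSUMPTION, borrow the year from orddate.
        WHEN t.dt NOT LIKE '%/%/%'
            THEN STR_TO_DATE(CONCAT(t.dt, '/', YEAR(t.orddate)), '%m/%d/%Y')
        -- Four-digit year (e.g. '05/11/2022').
        WHEN CHAR_LENGTH(SUBSTRING_INDEX(t.dt, '/', -1)) = 4
            THEN STR_TO_DATE(t.dt, '%m/%d/%Y')
        -- Two-digit year (e.g. '5/14/22').
        ELSE STR_TO_DATE(t.dt, '%m/%d/%y')
    END AS soldorstockdate
FROM (
    SELECT
        i.*,
        -- The year part is optional, so '5/13' matches as well as '5/13/22'.
        REGEXP_SUBSTR(i.orderstatus, '[0-9]{1,2}/[0-9]{1,2}(/[0-9]{2,4})?') AS dt
    FROM `input` AS i
) AS t;
```

Borrowing the year from orddate can give the wrong result around a year boundary (an order placed in December with a status date in January), so treat that branch as a starting point and adjust it to your data.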

If you have any questions about how it works, feel free to ask.