How do I get unique categories from all documents in CosmosDB?

I have millions of documents in CosmosDB, which I query through the SQL API, and I need to find the unique categories across all of them.

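The documents look like the one below; the categories array sits just under the description. I don't care about order. I only need every unique category across all documents in the collection. Later I want to build queries on the categories, but that is a separate problem: first I need to pull them all out so I know what the possible options are. I can't figure out a query that returns only the category names.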

{
    "id": "56d934d3-90bf-4f5a-b602-e515fefa599f",
    "_id": "5bf6705f9568cf00013cd13c",
    "vendor": "XXX",
    "updatedAt": "2018-11-23T03:55:30.044Z",
    "locales": [
        {
            "title": "Cold shoulder t-shirt",
            "description": "Because collar bones. Trending cold shoulder t-shirt in 100% organic cotton. Classic, wide and boxy t-shirt fit with cut-out details. In black, because black tees and fashion are like this (insert friendly hand gesture). This style is online exclusive.",
            "categories": [
                "Women",
                "clothing",
                "tops"
            ],
            "brand": null,
            "images": [
                "https://lp.xxx.com/app002prod?set=source[01_0659881_001_102],type[ECOMLOOK],device[hdpi],quality[80],ImageVersion[2018081]&call=url[file:/product/main]",
                "https://lp.xxx.com/app002prod?set=source[01_0659881_001_203],type[ECOMLOOK],device[hdpi],quality[80],ImageVersion[2018081]&call=url[file:/product/main]",
                "https://lp.xxx.com/app002prod?set=source[01_0659881_001_301],type[ECOMLOOK],device[hdpi],quality[80],ImageVersion[2018081]&call=url[file:/product/main]",
                "https://lp.xxx.com/app002prod?set=source[02_0659881_001_101],type[PRODUCT],device[hdpi],quality[80],ImageVersion[1.0]&call=url[file:/product/main]"
            ],
            "country": "SE",
            "currency": "SEK",
            "language": "en",
            "variants": [
                {
                    "artno": "0659881001",
                    "urls": [
                        "https://click.linksynergy.com/link?id=INtcw3sexSw&offerid=491018&type=2&murl=https%3A%2F%2Fwww.xxx.com%2Fen_sek%2Fclothing%2Ftops%2Fproduct.cold-shoulder-t-shirt-black-magic.0659881001.html"
                    ],
                    "price": 80,
                    "stock": 0,
                    "attributes": {
                        "size": "XXS",
                        "color": "Black magic"
                    }
                },
                {
                    "artno": "xxx",
                    "urls": [
                        "https://click.linksynergy.com/link?id=INtcw3sexSw&offerid=491018&type=2&murl=https%3A%2F%2Fwww.xxx.com%2Fen_sek%2Fclothing%2Ftops%2Fproduct.cold-shoulder-t-shirt-black-magic.0659881001.html"
                    ],
                    "price": 80,
                    "stock": 0,
                    "attributes": {
                        "size": "XS",
                        "color": "Black magic"
                    }
                },
                {
                    "artno": "0659881001",
                    "urls": [
                        "https://click.linksynergy.com/link?id=INtcw3sexSw&offerid=491018&type=2&murl=https%3A%2F%2Fwww.xxx.com%2Fen_sek%2Fclothing%2Ftops%2Fproduct.cold-shoulder-t-shirt-black-magic.0659881001.html"
                    ],
                    "price": 80,
                    "stock": 0,
                    "attributes": {
                        "size": "XL",
                        "color": "Black magic"
                    }
                },
                {
                    "artno": "0659881001",
                    "urls": [
                        "https://click.linksynergy.com/link?id=INtcw3sexSw&offerid=491018&type=2&murl=https%3A%2F%2Fwww.xxx.com%2Fen_sek%2Fclothing%2Ftops%2Fproduct.cold-shoulder-t-shirt-black-magic.0659881001.html"
                    ],
                    "price": 80,
                    "stock": 0,
                    "attributes": {
                        "size": "S",
                        "color": "Black magic"
                    }
                },
                {
                    "artno": "0659881001",
                    "urls": [
                        "https://click.linksynergy.com/link?id=INtcw3sexSw&offerid=491018&type=2&murl=https%3A%2F%2Fwww.xxx.com%2Fen_sek%2Fclothing%2Ftops%2Fproduct.cold-shoulder-t-shirt-black-magic.0659881001.html"
                    ],
                    "price": 80,
                    "stock": 1,
                    "attributes": {
                        "size": "M",
                        "color": "Black magic"
                    }
                },
                {
                    "artno": "0659881001",
                    "urls": [
                        "https://click.linksynergy.com/link?id=INtcw3sexSw&offerid=491018&type=2&murl=https%3A%2F%2Fwww.xxx.com%2Fen_sek%2Fclothing%2Ftops%2Fproduct.cold-shoulder-t-shirt-black-magic.0659881001.html"
                    ],
                    "price": 80,
                    "stock": 0,
                    "attributes": {
                        "size": "L",
                        "color": "Black magic"
                    }
                }
            ]
        }
    ],
    "_rid": "QEwcALNbIz8GAAAAAAAAAA==",
    "_self": "dbs/QEwcAA==/colls/QEwcALNbIz8=/docs/QEwcALNbIz8GAAAAAAAAAA==/",
    "_etag": "\"6a0003c6-0000-0000-0000-5bf7958c0000\"",
    "_attachments": "attachments/",
    "_ts": 1542952332
}

Here is my test, which returns all unique category names.

Sample documents:

[
    {
        "id": "1",
        "locales": [
            {
                "categories": [
                    "Women",
                    "clothing",
                    "tops"
                ]
            }
        ]
    },
    {
        "id": "2",
        "locales": [
            {
                "categories": [
                    "Men",
                    "test",
                    "tops"
                ]
            }
        ]
    }
]

SQL:

SELECT distinct cat FROM c
join l in c.locales
join cat in l.categories

Output:

[
    {
        "cat": "Women"
    },
    {
        "cat": "clothing"
    },
    {
        "cat": "tops"
    },
    {
        "cat": "Men"
    },
    {
        "cat": "test"
    }
]
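
To make clear what the two JOINs do, here is a rough plain-JavaScript emulation of the query run against the sample documents. It is only a sketch for illustration: `distinctCategories` and `sampleDocs` are names I made up, and nothing here calls Cosmos DB. Each JOIN becomes a nested loop over the array, and DISTINCT becomes a set of values already seen.

```javascript
// Emulates, in memory:
//   SELECT DISTINCT cat FROM c JOIN l IN c.locales JOIN cat IN l.categories
// Hypothetical helper, not part of any Cosmos DB SDK.
function distinctCategories(docs) {
    var seen = Object.create(null); // values already emitted
    var rows = [];
    for (var d = 0; d < docs.length; d++) {
        var locales = docs[d].locales || [];
        for (var l = 0; l < locales.length; l++) {          // JOIN l IN c.locales
            var categories = locales[l].categories || [];
            for (var k = 0; k < categories.length; k++) {   // JOIN cat IN l.categories
                var cat = categories[k];
                if (!seen[cat]) {                           // DISTINCT
                    seen[cat] = true;
                    rows.push({ cat: cat });
                }
            }
        }
    }
    return rows;
}

var sampleDocs = [
    { id: "1", locales: [{ categories: ["Women", "clothing", "tops"] }] },
    { id: "2", locales: [{ categories: ["Men", "test", "tops"] }] }
];

console.log(distinctCategories(sampleDocs));
```

The emulation keeps first-seen order, which happens to match the output above; I would not rely on the real query returning rows in any particular order.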

If you don't want the result to be case sensitive, just use the LOWER function in the SQL.

SELECT distinct Lower(cat) FROM c
join l in c.locales
join cat in l.categories

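If you want a flat array such as ["Women","clothing","tops","Men","test"], the query above will not produce it as written, because each result is wrapped in an object. The VALUE keyword (SELECT DISTINCT VALUE cat ...) should return the bare strings; alternatively, you can use a stored procedure to flatten the output into an array.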

For example, add code like the following to the stored procedure:

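    // `array` is assumed to hold the query results, e.g. [{ "cat": "Women" }, ...]
    var returnArray = [];
    for (var i = 0; i < array.length; i++) {
        // The property name matches the alias in the SELECT clause (`cat`)
        returnArray.push(array[i].cat);
    }
    return returnArray;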
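
The flattening step can also be written as a small standalone function, which makes it easy to try outside Cosmos DB. This is a minimal sketch: `flattenRows` and the `key` parameter are hypothetical names of mine, and the function only assumes that each row is an object such as { "cat": "Women" }. It also removes duplicates, which may be useful if you merge results from several pages or partitions.

```javascript
// Hypothetical helper: turn [{ cat: "Women" }, { cat: "tops" }, ...]
// into ["Women", "tops", ...], dropping duplicates and missing values.
function flattenRows(rows, key) {
    var seen = Object.create(null); // values already copied to the output
    var out = [];
    for (var i = 0; i < rows.length; i++) {
        var value = rows[i][key];
        if (value !== undefined && !seen[value]) {
            seen[value] = true;
            out.push(value);
        }
    }
    return out;
}

var rows = [
    { cat: "Women" }, { cat: "clothing" }, { cat: "tops" },
    { cat: "Men" }, { cat: "test" }, { cat: "tops" }
];

console.log(flattenRows(rows, "cat"));
// prints [ 'Women', 'clothing', 'tops', 'Men', 'test' ]
```

Inside a stored procedure you would typically hand the result back with getContext().getResponse().setBody(...) rather than a bare return. With the LOWER query as written, the column has no alias, so I would add one (for example AS cat) and pass that name as `key`.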