Database vs Data Mart vs Data Warehouse vs Data Lake
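I am looking for the high-level differences/comparison between:
- Database
- Data Mart (top-down approach)
- Data Warehouse
- Data Lake
Use relative comparisons where specifics are not known.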
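The table below gives a high-level comparison of the data tiers mentioned. If any of it needs correction, please feel free to comment.
Note: run the HTML to view the result.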
#dataTierComparison {
font-family: "Trebuchet MS", Arial, Helvetica, sans-serif;
border-collapse: collapse;
width: 100%;
}
#dataTierComparison td,
#dataTierComparison th {
border: 1px solid #ddd;
padding: 8px;
}
#dataTierComparison tr:nth-child(even) {
background-color: #f2f2f2;
}
#dataTierComparison tr:hover {
background-color: #ddd;
}
#dataTierComparison th {
padding-top: 12px;
padding-bottom: 12px;
text-align: left;
background-color: #4CAF50;
color: white;
}
<table id="dataTierComparison">
<tbody>
<tr>
<th> </th>
<th>Database</th>
<th>Data Mart (Top-down)</th>
<th>Data Warehouse</th>
<th>Data Lake</th>
</tr>
<tr>
<th>Source</th>
<td>Single</td>
<td>Single</td>
<td>Multiple</td>
<td>Multiple</td>
</tr>
<tr>
<th>Structure</th>
<td>Structured</td>
<td>Structured</td>
<td>Structured</td>
<td>Raw</td>
</tr>
<tr>
<th>Purpose</th>
<td>Determined</td>
<td>Determined</td>
<td>Determined</td>
<td>Undetermined</td>
</tr>
<tr>
<th>Storage</th>
<td>Centralized</td>
<td>Decentralized</td>
<td>Centralized</td>
<td>Centralized</td>
</tr>
<tr>
<th>Data Format</th>
<td>Detailed</td>
<td>Summarized</td>
<td>Detailed</td>
<td>All</td>
</tr>
<tr>
<th>Flexibility</th>
<td>Low</td>
<td>Medium</td>
<td>Medium</td>
<td>High</td>
</tr>
<tr>
<th>Primary Use</th>
<td>Transactional</td>
<td>Reporting</td>
<td>Analytics &amp; Reporting</td>
<td>Analytics</td>
</tr>
<tr>
<th>Cost</th>
<td>Low</td>
<td>Medium</td>
<td>Medium</td>
<td>High</td>
</tr>
<tr>
<th>Data Volume</th>
<td>Low</td>
<td>Low</td>
<td>Medium</td>
<td>High</td>
</tr>
<tr>
<th>Development</th>
<td>Top-down</td>
<td>Bottom-up</td>
<td>Top-down</td>
<td>All</td>
</tr>
<tr>
<th>Design Time</th>
<td>Medium</td>
<td>Medium</td>
<td>High</td>
<td>Low</td>
</tr>
<tr>
<th>Volatility</th>
<td>Medium</td>
<td>Low</td>
<td>None</td>
<td>None</td>
</tr>
<tr>
<th>Data Operations</th>
<td>CRUD</td>
<td>CR</td>
<td>CRU</td>
<td>CR</td>
</tr>
<tr>
<th>Subject Area</th>
<td>Single</td>
<td>Single</td>
<td>Multiple</td>
<td>Multiple</td>
</tr>
<tr>
<th>Design Schema</th>
<td>Relational</td>
<td>Multi-dimensional</td>
<td>Relational</td>
<td>No Schema</td>
</tr>
</tbody>
</table>
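To make a few rows of the table concrete ("Data Format", "Data Operations", "Design Schema"), here is a minimal Python sketch. It is only an illustration under assumptions: the `orders` and `sales_by_region` tables, the sample records, and the `read_orders` helper are all hypothetical, and an in-memory SQLite database stands in for both the database and the data mart.

```python
import json
import sqlite3

# Hypothetical sample data; all table and column names are illustrative only.
conn = sqlite3.connect(":memory:")

# Database tier: detailed rows, full CRUD.
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("north", 10.0), ("north", 15.0), ("south", 7.5)],
)
conn.execute("UPDATE orders SET amount = 20.0 WHERE id = 2")  # update
conn.execute("DELETE FROM orders WHERE id = 3")               # delete

# Data mart tier: summarized, single subject area, created once and then read.
conn.execute(
    "CREATE TABLE sales_by_region AS "
    "SELECT region, SUM(amount) AS total, COUNT(*) AS order_count "
    "FROM orders GROUP BY region"
)
mart_rows = conn.execute(
    "SELECT region, total, order_count FROM sales_by_region ORDER BY region"
).fetchall()

# Data lake tier: raw records kept as-is; structure is applied only on read.
raw_events = [
    '{"type": "click", "page": "/home"}',
    '{"type": "order", "region": "north", "amount": 10.0}',
    "free-text log line with no structure",
]


def read_orders(lines):
    """Schema-on-read: parse what fits the expected shape, skip the rest."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # not JSON at all; the lake still stores it
        if isinstance(record, dict) and record.get("type") == "order":
            yield record


lake_orders = list(read_orders(raw_events))
```

The database table accepts updates and deletes on individual rows, the mart holds only the aggregate derived from it, and the lake keeps every raw line and defers interpretation to the reader. A data warehouse would sit between these: detailed, structured data integrated from several such sources.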