Upsampling: insert extra values between each pair of consecutive elements of a vector
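Suppose we have a vector V consisting of 20 floating-point numbers. Is it possible to insert values between each pair of these numbers so that V becomes a vector of exactly 50 numbers?
The inserted values should be random numbers between the lower and upper neighbouring values. For now I decided to insert the midpoint of the two values.
I tried the following: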
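#include <vector>
using std::vector;

vector<double> upsample(const vector<double>& in)
{
    vector<double> temp;
    if (in.empty()) return temp; // in.size() - 1 would wrap around for an empty vector
    for (size_t i = 1; i < in.size(); i++)
    {
        double sample = (in[i] + in[i - 1]) / 2; // midpoint of two neighbours
        temp.push_back(in[i - 1]);
        temp.push_back(sample);
    }
    temp.push_back(in.back());
    return temp;
}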
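With this function, an input vector of n elements grows to 2n - 1 elements (20 elements become 39). The input vector can have different sizes, all smaller than 50.
I think a vector of size 50 could be obtained by randomly inserting more than one value between some pairs of elements (e.g. 3 values between V[0] and V[1], 1 value between V[3] and V[4], and so on). Is this possible?
Could you please guide me on how to do this?
Thank you.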
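So I did some math myself, because I was curious how to get the weight ratios (as if upsampling linearly to a common multiple and then picking only the target values from the large array, but without creating the large array, just using weights to know how much the left and right elements contribute to a particular value).
The sample code always creates a new value by a simple weighted average (i.e. 40% of 123.4 and 60% of 567.8 gives the "upscaled" value 390.04). No randomness is used to fill in the upscaled values (that part is left to the OP).
The ratios work like this:
If a vector of size M is upscaled to size N (M <= N) (the "upscale" always preserves the first and last elements of the input vector, they are "fixed" in this proposal),
then every upscaled element can be viewed as lying somewhere between some original elements [i, i+1].
If we declare the "distance" between the source elements [i, i+1] to be d = N-1, then the position of an upscaled element between them can always be expressed as some j/d where j:[0,d] (when j is exactly d, the element sits precisely on the "i+1" element, which can be treated the same as j=0 for the source elements [i+1,i+2]).
The distance between two upscaled elements is then M-1.
So when the source vector has size 4 and the upscaled vector should have size 5, the upscaled elements have the ratios [ [4/4,0/4], [1/4,3/4], [2/4,2/4], [3/4,1/4], [0/4,4/4] ] of the elements (indices into the vector) [ [0,1], [0,1], [1,2], [2,3], [2,3] ].
(The "distance" between source elements is 5-1=4, which is the "/4" used to normalize the weights. The "distance" between upscaled elements is 4-1=3, which is why the ratios shift by [-3,+3] at each step.)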
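To make the ratios above easier to play with, here is a minimal sketch that only lists the weights, without touching any data. The struct UpscaleWeight and the function upscale_weights are hypothetical names introduced for illustration; they are not part of the examples that follow.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One entry per upscaled element: which source pair [index_left, index_left+1]
// it lies in, and how the total weight (n - 1) is split between the two.
struct UpscaleWeight {
    std::size_t index_left;   // index of the left source element
    std::size_t weight_left;  // numerator of the left weight
    std::size_t weight_right; // numerator of the right weight
};

// Hypothetical helper: weights for upscaling m source elements to n elements.
// All weights share the denominator n - 1.
std::vector<UpscaleWeight> upscale_weights(std::size_t m, std::size_t n) {
    assert(2 <= m && m <= n);
    const std::size_t total = n - 1; // "distance" between source elements
    const std::size_t step = m - 1;  // "distance" between upscaled elements
    std::vector<UpscaleWeight> result;
    result.reserve(n);
    for (std::size_t k = 0; k < n; ++k) {
        const std::size_t pos = k * step; // position in units of 1/total
        result.push_back({pos / total, total - pos % total, pos % total});
    }
    return result;
}
```

Calling upscale_weights(4, 5) yields the pairs from the 4-to-5 case above, with one cosmetic difference: the last element is reported as full weight on index 3 ([4/4,0/4] of [3,4]) rather than as [0/4,4/4] of [2,3], which describes the same position.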
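I'm afraid my description is far from "obvious" (which is how it feels in my head after figuring it out), but if you put some of this into a spreadsheet and play with it, hopefully it will make sense. Or maybe you can step through the code in a debugger to get a better feel for how the mumbling above turns into real code.
Code example 1: this one "copies" a source element only when the weight falls exactly and fully on it (i.e. with the sample data only the first and last elements are "copied", and the remaining upscaled elements are weighted averages of the original values).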
#include <iostream>
#include <vector>
#include <cassert>
static double get_upscale_value(const size_t total_weight, const size_t right_weight, const double left, const double right) {
// do the simple weighted average for demonstration purposes
const size_t left_weight = total_weight - right_weight;
return (left * left_weight + right * right_weight) / total_weight;
}
std::vector<double> upsample_weighted(const std::vector<double>& in, size_t n)
{
assert( 2 <= in.size() && in.size() <= n ); // this is really only upscaling (can't downscale)
// resulting vector variable
std::vector<double> upscaled;
upscaled.reserve(n);
// upscaling factors variables and constants
size_t index_left = 0; // first "left" item is the in[0] element
size_t weight_right = 0; // and "right" has zero weight (i.e. in[0] is copied)
const size_t in_weight = n - 1; // total weight of single "in" element
const size_t weight_add = in.size() - 1; // shift of weight between "upscaled" elements
while (upscaled.size() < n) { // add N upscaled items
if (0 == weight_right) {
// full weight of left -> just copy it (never tainted by "upscaling")
upscaled.push_back(in[index_left]);
} else {
// the weight is somewhere between "left" and "right" items of "in" vector
// i.e. weight = 1..(in_weight-1) ("in_weight" is full "right" value, never happens)
double upscaled_val = get_upscale_value(in_weight, weight_right, in[index_left], in[index_left+1]);
upscaled.push_back(upscaled_val);
}
weight_right += weight_add;
if (in_weight <= weight_right) {
// the weight shifted so much that "right" is new "left"
++index_left;
weight_right -= in_weight;
}
}
return upscaled;
}
int main()
{
std::vector<double> in { 10, 20, 30 };
// std::vector<double> in { 20, 10, 40 };
std::vector<double> upscaled = upsample_weighted(in, 14);
std::cout << "upsample_weighted from " << in.size() << " to " << upscaled.size() << ": ";
for (const auto i : upscaled) {
std::cout << i << " ";
}
std::cout << std::endl;
return 0;
}
Output:
upsample_weighted from 3 to 14: 10 11.5385 13.0769 14.6154 16.1538 17.6923 19.2308 20.7692 22.3077 23.8462 25.3846 26.9231 28.4615 30
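As a cross-check of the "linear upsampling" intuition, the same values can be reproduced with plain floating-point linear interpolation. The sketch below is an assumption-laden counterpart, not the answer's method: upsample_lerp is a hypothetical name, and it computes each position as k*(M-1)/(N-1) in double instead of tracking integer weights.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical floating-point counterpart of upsample_weighted.
std::vector<double> upsample_lerp(const std::vector<double>& in, std::size_t n) {
    assert(2 <= in.size() && in.size() <= n);
    std::vector<double> out;
    out.reserve(n);
    const std::size_t last = in.size() - 1;
    for (std::size_t k = 0; k < n; ++k) {
        // position of output element k on the source index axis, in [0, last]
        const double pos = static_cast<double>(k) * last / (n - 1);
        std::size_t i = static_cast<std::size_t>(pos);
        if (i >= last) i = last - 1; // keep i + 1 in range for the final element
        const double t = pos - i;    // fraction of the way from in[i] to in[i + 1]
        out.push_back(in[i] * (1.0 - t) + in[i + 1] * t);
    }
    return out;
}
```

For the input { 10, 20, 30 } and n = 14 this gives the same numbers as the output above, up to floating-point rounding. One practical difference is that the integer weights of example 1 land on source elements exactly, whereas a floating-point position may be off by a rounding error.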
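Code example 2: this one "copies" every source element and uses the weighted average only to fill the gaps in between, so as much of the original data as possible is preserved (at the price that the result is not a linear upscale of the original data set, but is "aliased" to the "grid" defined by the target size):
(The code is almost identical to the first example, except for one if line in the upscaler.)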
#include <iostream>
#include <vector>
#include <cassert>
static double get_upscale_value(const size_t total_weight, const size_t right_weight, const double left, const double right) {
// do the simple weighted average for demonstration purposes
const size_t left_weight = total_weight - right_weight;
return (left * left_weight + right * right_weight) / total_weight;
}
// identical to "upsample_weighted", except all source values from "in" are copied into result
// and only extra added values (to make the target size) are generated by "get_upscale_value"
std::vector<double> upsample_copy_preferred(const std::vector<double>& in, size_t n)
{
assert( 2 <= in.size() && in.size() <= n ); // this is really only upscaling (can't downscale)
// resulting vector variable
std::vector<double> upscaled;
upscaled.reserve(n);
// upscaling factors variables and constants
size_t index_left = 0; // first "left" item is the in[0] element
size_t weight_right = 0; // and "right" has zero weight (i.e. in[0] is copied)
const size_t in_weight = n - 1; // total weight of single "in" element
const size_t weight_add = in.size() - 1; // shift of weight between "upscaled" elements
while (upscaled.size() < n) { // add N upscaled items
/* ! */ if (weight_right < weight_add) { /* ! this line is modified */
// most of the weight on left -> copy it (don't taint it by upscaling)
upscaled.push_back(in[index_left]);
} else {
// the weight is somewhere between "left" and "right" items of "in" vector
// i.e. weight = weight_add..(in_weight-1) (smaller weights were copied above, "in_weight" never happens)
double upscaled_val = get_upscale_value(in_weight, weight_right, in[index_left], in[index_left+1]);
upscaled.push_back(upscaled_val);
}
weight_right += weight_add;
if (in_weight <= weight_right) {
// the weight shifted so much that "right" is new "left"
++index_left;
weight_right -= in_weight;
}
}
return upscaled;
}
int main()
{
std::vector<double> in { 10, 20, 30 };
// std::vector<double> in { 20, 10, 40 };
std::vector<double> upscaled = upsample_copy_preferred(in, 14);
std::cout << "upsample_copy_preferred from " << in.size() << " to " << upscaled.size() << ": ";
for (const auto i : upscaled) {
std::cout << i << " ";
}
std::cout << std::endl;
return 0;
}
Output:
upsample_copy_preferred from 3 to 14: 10 11.5385 13.0769 14.6154 16.1538 17.6923 19.2308 20 22.3077 23.8462 25.3846 26.9231 28.4615 30
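(Note that the "20.7692" from example 1 is just "20" here, a copy of the original sample, even though under linear interpolation "30" would carry a small weight at this point.)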
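Both examples leave out the random filling that the question asked about. One possible way to do it is sketched below, under assumptions of my own: upsample_random is a hypothetical name, the extra N - M values are spread over gaps picked uniformly at random, and each inserted value is drawn uniformly between its two neighbouring source values.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical sketch: keeps every source value and distributes the
// (n - in.size()) extra values over randomly chosen gaps.
std::vector<double> upsample_random(const std::vector<double>& in, std::size_t n,
                                    std::mt19937& rng) {
    assert(2 <= in.size() && in.size() <= n);
    const std::size_t gaps = in.size() - 1;
    // decide how many extra values go into each gap
    std::vector<std::size_t> extra(gaps, 0);
    std::uniform_int_distribution<std::size_t> pick_gap(0, gaps - 1);
    for (std::size_t k = in.size(); k < n; ++k) {
        ++extra[pick_gap(rng)];
    }
    std::vector<double> out;
    out.reserve(n);
    for (std::size_t i = 0; i < gaps; ++i) {
        out.push_back(in[i]); // source value is always kept
        const double lo = std::min(in[i], in[i + 1]);
        const double hi = std::max(in[i], in[i + 1]);
        std::uniform_real_distribution<double> value(lo, hi);
        for (std::size_t j = 0; j < extra[i]; ++j) {
            out.push_back(value(rng)); // random value between the neighbours
        }
    }
    out.push_back(in.back());
    return out;
}
```

With a 20-element input and n = 50 this returns exactly 50 values. Values inserted into the same gap are not sorted relative to each other, so if a monotonic run between neighbours matters, they would need to be sorted per gap.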