Mangled OpenGL rendering on some platforms
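I've been experimenting with batched sprite rendering, and I have a solution that works well on my desktop PC. However, when I try it on my laptop with integrated Intel UHD 620 graphics, I get the following performance warnings: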
[21:42:03 error] OpenGL: API - Performance - Recompiling fragment shader for program 27
[21:42:03 error] OpenGL: API - Performance - multisampled FBO 0->1
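Presumably because of whatever triggers these performance warnings, frames that take 1-2 ms on my machine with a dedicated graphics card take about 100 ms on my laptop.
Here is my renderer code: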
BatchedSpriteRenderer::BatchedSpriteRenderer(ResourceManager &resource_manager)
    : resource_manager(&resource_manager),
      max_sprites(100000),
      vertex_array(std::make_unique<VertexArray>()),
      vertex_buffer(std::make_unique<VertexBuffer>())
{
    resource_manager.load_shader("batched_texture",
                                 "shaders/texture_batched.vert",
                                 "shaders/texture.frag");
    std::vector<unsigned int> sprite_indices;
    for (int i = 0; i < max_sprites; ++i)
    {
        unsigned int sprite_number = i * 4;
        sprite_indices.push_back(0 + sprite_number);
        sprite_indices.push_back(1 + sprite_number);
        sprite_indices.push_back(2 + sprite_number);
        sprite_indices.push_back(2 + sprite_number);
        sprite_indices.push_back(3 + sprite_number);
        sprite_indices.push_back(0 + sprite_number);
    }
    element_buffer = std::make_unique<ElementBuffer>(sprite_indices.data(), max_sprites * 6);
    VertexBufferLayout layout;
    layout.push<float>(2);
    layout.push<float>(2);
    layout.push<float>(4);
    vertex_array->add_buffer(*vertex_buffer, layout);
}
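For reference, the index pattern built in the constructor can be sketched in isolation, without any OpenGL calls. The function name make_quad_indices is made up for this illustration and is not part of the renderer; the only difference from the loop in the constructor is that the vector reserves its full size up front.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not in the original renderer): builds the indices for
// `quad_count` quads, two triangles per quad, using the 0-1-2 / 2-3-0 pattern.
std::vector<unsigned int> make_quad_indices(std::size_t quad_count)
{
    std::vector<unsigned int> indices;
    indices.reserve(quad_count * 6); // one allocation instead of repeated growth
    for (std::size_t i = 0; i < quad_count; ++i)
    {
        // Each quad owns four consecutive vertices starting at `base`.
        const unsigned int base = static_cast<unsigned int>(i * 4);
        for (unsigned int offset : {0u, 1u, 2u, 2u, 3u, 0u})
            indices.push_back(base + offset);
    }
    return indices;
}
```

Calling make_quad_indices(max_sprites) would give the same contents as sprite_indices in the constructor.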
void BatchedSpriteRenderer::draw(const std::string &texture,
                                 const std::vector<glm::mat4> &transforms,
                                 const glm::mat4 &view)
{
    vertex_array->bind();
    auto shader = resource_manager->shader_store.get("batched_texture");
    shader->bind();
    std::vector<SpriteVertex> vertices;
    vertices.reserve(transforms.size() * 4);
    for (const auto &transform : transforms)
    {
        glm::vec4 transformed_position = transform * glm::vec4(0.0, 1.0, 1.0, 1.0);
        vertices.push_back({glm::vec2(transformed_position.x, transformed_position.y),
                            glm::vec2(0.0, 1.0),
                            glm::vec4(1.0, 1.0, 1.0, 1.0)});
        transformed_position = transform * glm::vec4(0.0, 0.0, 1.0, 1.0);
        vertices.push_back({glm::vec2(transformed_position.x, transformed_position.y),
                            glm::vec2(0.0, 0.0),
                            glm::vec4(1.0, 1.0, 1.0, 1.0)});
        transformed_position = transform * glm::vec4(1.0, 0.0, 1.0, 1.0);
        vertices.push_back({glm::vec2(transformed_position.x, transformed_position.y),
                            glm::vec2(1.0, 0.0),
                            glm::vec4(1.0, 1.0, 1.0, 1.0)});
        transformed_position = transform * glm::vec4(1.0, 1.0, 1.0, 1.0);
        vertices.push_back({glm::vec2(transformed_position.x, transformed_position.y),
                            glm::vec2(1.0, 1.0),
                            glm::vec4(1.0, 1.0, 1.0, 1.0)});
    }
    vertex_buffer->add_data(vertices.data(),
                            sizeof(SpriteVertex) * vertices.size(),
                            GL_DYNAMIC_DRAW);
    shader->set_uniform_mat4f("u_view", view);
    shader->set_uniform_1i("u_texture", 0);
    resource_manager->texture_store.get(texture)->bind();
    glDrawElements(GL_TRIANGLES, transforms.size() * 6, GL_UNSIGNED_INT, 0);
}
Hopefully my abstractions are fairly self-explanatory. Each of the abstraction classes (VertexArray, VertexBuffer, ElementBuffer, VertexBufferLayout) manages the lifetime of its equivalent OpenGL object.
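Since the VertexBufferLayout class itself is not shown, here is a minimal sketch of what a layout with a push<T>(count) interface might track. Everything in it (LayoutSketch, LayoutElement, the accessor names) is an assumption for illustration only, not the real class. It just records a byte offset per attribute and the total stride, which is the information a call like glVertexAttribPointer would need.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one attribute in the layout.
struct LayoutElement
{
    unsigned int count; // number of components, e.g. 2 for a vec2
    std::size_t offset; // byte offset from the start of the vertex
};

// Hypothetical stand-in for VertexBufferLayout; the real class is not shown.
class LayoutSketch
{
public:
    // Appends an attribute made of `count` components of type T.
    template <typename T>
    void push(unsigned int count)
    {
        elements_.push_back({count, stride_}); // attribute starts at the current end
        stride_ += sizeof(T) * count;          // grow the per-vertex size
    }

    const std::vector<LayoutElement> &elements() const { return elements_; }
    std::size_t stride() const { return stride_; }

private:
    std::vector<LayoutElement> elements_;
    std::size_t stride_ = 0;
};
```

With the same three pushes as in the constructor (2, 2 and 4 floats), this sketch yields offsets of 0, 2 and 4 floats and a stride of 8 floats, so a SpriteVertex would have to be laid out as position, texture coordinate, color with no padding for the two to agree.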
Here are the shaders being used:
texture_batched.vert
#version 430 core
layout(location = 0) in vec2 v_position;
layout(location = 1) in vec2 v_tex_coord;
layout(location = 2) in vec4 v_color;
out vec4 color;
out vec2 tex_coord;
uniform mat4 u_view;
void main()
{
    tex_coord = v_tex_coord;
    gl_Position = u_view * vec4(v_position, 0.0, 1.0);
    color = v_color;
}
texture.frag
#version 430 core
in vec4 color;
in vec2 tex_coord;
out vec4 frag_color;
uniform sampler2D u_texture;
void main()
{
    frag_color = texture(u_texture, tex_coord);
    frag_color *= color;
}
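What is causing these performance problems, and how can I fix them?
EDIT: I completely forgot to mention that the image actually rendered with this is completely messed up. I'll try to grab a screenshot of it working correctly on my desktop PC, but here is what the broken version looks like:
It should be a neat grid of 200x200 white circles.
EDIT 2: I tried it on another computer, this time with a GTX 1050 Ti, and it is broken there too. There were no error messages or warnings this time, so the warnings may be unrelated.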
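As far as I can tell, it turned out to have nothing to do with OpenGL.
In the draw function, I create a vector called vertices and then put all the vertices into it. For some reason, when I recreated that vector every frame, the subsequent push_back calls did not add to the vector correctly. The members of the SpriteVertex struct were getting mixed up. So instead of the correct layout: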
pos tex_coord color
pos tex_coord color
pos tex_coord color
pos tex_coord color
it was being filled with the following layout:
pos tex_coord color
tex_coord pos color
tex_coord pos color
tex_coord pos color
Or at least something to that effect.
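One way to rule out a plain declaration-order mismatch between the struct and the buffer layout is to check the offsets at compile time. The sketch below uses a made-up SpriteVertexSketch with raw float arrays standing in for glm::vec2 and glm::vec4, because the real SpriteVertex definition isn't shown here. It would not explain the behavior described above by itself; it only confirms that the struct matches the 2/2/4 float layout pushed in the constructor.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for SpriteVertex; plain float arrays replace
// glm::vec2 / glm::vec4 so the example has no external dependencies.
struct SpriteVertexSketch
{
    float position[2];
    float tex_coord[2];
    float color[4];
};

// offsetof is only guaranteed to work on standard-layout types.
static_assert(std::is_standard_layout<SpriteVertexSketch>::value,
              "vertex struct must be standard layout");

// Members must appear in the same order and at the same byte offsets
// as the attributes pushed into the buffer layout (2, 2, 4 floats).
static_assert(offsetof(SpriteVertexSketch, position) == 0,
              "position must come first");
static_assert(offsetof(SpriteVertexSketch, tex_coord) == 2 * sizeof(float),
              "tex_coord must follow position");
static_assert(offsetof(SpriteVertexSketch, color) == 4 * sizeof(float),
              "color must follow tex_coord");
static_assert(sizeof(SpriteVertexSketch) == 8 * sizeof(float),
              "no padding expected between or after members");
```

If any of these assertions failed for the real struct, the attribute pointers set up from the layout would read the wrong members.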
I changed it so that the vertices vector is a member of the BatchedSpriteRenderer class, with space reserved for the maximum possible number of vertices.
void BatchedSpriteRenderer::draw(const std::string &texture,
                                 const std::vector<glm::mat4> &transforms,
                                 const glm::mat4 &view)
{
    vertex_array->bind();
    auto shader = resource_manager->shader_store.get("batched_texture");
    shader->bind();
    for (unsigned int i = 0; i < transforms.size(); ++i)
    {
        const auto &transform = transforms[i];
        glm::vec4 transformed_position = transform * glm::vec4(0.0, 1.0, 1.0, 1.0);
        vertices[i * 4] = {glm::vec2(transformed_position.x,
                                     transformed_position.y),
                           glm::vec2(0.0, 1.0),
                           glm::vec4(1.0, 1.0, 1.0, 1.0)};
        transformed_position = transform * glm::vec4(0.0, 0.0, 1.0, 1.0);
        vertices[i * 4 + 1] = {glm::vec2(transformed_position.x,
                                         transformed_position.y),
                               glm::vec2(0.0, 0.0),
                               glm::vec4(1.0, 1.0, 1.0, 1.0)};
        transformed_position = transform * glm::vec4(1.0, 0.0, 1.0, 1.0);
        vertices[i * 4 + 2] = {glm::vec2(transformed_position.x,
                                         transformed_position.y),
                               glm::vec2(1.0, 0.0),
                               glm::vec4(1.0, 1.0, 1.0, 1.0)};
        transformed_position = transform * glm::vec4(1.0, 1.0, 1.0, 1.0);
        vertices[i * 4 + 3] = {glm::vec2(transformed_position.x,
                                         transformed_position.y),
                               glm::vec2(1.0, 1.0),
                               glm::vec4(1.0, 1.0, 1.0, 1.0)};
    }
    vertex_buffer->add_data(vertices.data(),
                            sizeof(SpriteVertex) * (transforms.size() * 4),
                            GL_DYNAMIC_DRAW);
    shader->set_uniform_mat4f("u_view", view);
    shader->set_uniform_1i("u_texture", 0);
    resource_manager->texture_store.get(texture)->bind();
    glDrawElements(GL_TRIANGLES, transforms.size() * 6, GL_UNSIGNED_INT, 0);
}