4

集成 upb 和 lua binding 的踩坑小计

 2 years ago
source link: https://owent.net/2022/2207.html
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

集成 upb 和 lua binding 的踩坑小计


最近新项目重新评估了一下protobuf的C/C++ -> Lua binding 方案。之前,使用最广泛的 Lua binding 方案应该是 云风 大大的 pbc 。但是这个库已经是作者弃坑好多年的状态了。我之前使用 pbc 的时候刚碰上 protobuf 3.0 刚出来,当时打了patch来适配 protobuf 3.0 ,还修复了一些其他问题。这个Patch有些推给了上游,有些因为和上游的某些机制冲突没有推。我了解到的很多其他项目也或多或少的打了自己的Patch,大多数也没往上游推。基本上 pbc 已经处于一个失维的状态,所以这次新项目就干脆来寻求更好,或者说仍然有良好活跃度的解决方案。于是就看向了 upb

upb 是一个 protobuf 官方的子项目,运行时使用纯C编写。代码生成工具使用C++,依赖 libprotoc 和 abseil-cpp 。并且 upb 是现在protobuf的Ruby, PHP和Python的官方运行时库,里面也包含了Lua binding模块。这样不仅技术支持有保障(有主流项目踩坑),并且能跟进protobuf的新功能。 在Lua binding方面,我们还有一些特殊的地方。因为编译的时候需要包含lua的头文件和设置链接库,而我们需要支持各种不同的lua运行时。(比如我们客户端用Unlua并集成到Unreal Engine,服务器用官方的lua 5.4,我们这边还有项目组定制化了Lua抹除了浮点数)。所以对于Lua binding,我们不能用预编译包的方式发布,只能用源码发布。所幸涉及的文件不多,还比较好处理。

目前看起来 upb 的工程管理还不是很规范,bazel构建系统可能用起来还比较简单,我们项目组主要使用cmake,就需要适配一下。实际上 upb 提供了工具通过bazel来生成cmake的工程文件,只是生成的文件有一些问题,那么本文主要记录一下适配过程中踩的一些坑和我们的解决方案。

Lua 5.3+ 版本适配

第一个问题是最简单最容易解决的,upb 的Lua binding代码虽然溜了一些适配代码。不过我估计没人测试过。有一些头文件和函数名使用时有问题的。这类版本问题都比较好处理,适配一下就行了。 这个Lua版本兼容的问题已经提 issue 到 https://github.com/protocolbuffers/upb/issues/734 。解决方案的PR也已经推到 https://github.com/protocolbuffers/upb/pull/735 。等啥时候和入了应该就好了。

一处C库的未定义行为调用

在 Lua binding 的代码中( upb/bindings/lua/upbc.cc ) 有一个地方是这么写的。

static void PrintString(int max_cols, absl::string_view* str,
                        protobuf::io::Printer* printer) {
  printer->Print("\'");
  while (max_cols > 0 && !str->empty()) {
    char ch = (*str)[0];
    if (ch == '\\') {
      printer->PrintRaw("\\\\");
      max_cols--;
    } else if (ch == '\'') {
      printer->PrintRaw("\\'");
      max_cols--;
    } else if (isprint(ch)) {
      printer->WriteRaw(&ch, 1);
      max_cols--;
    } else {
      unsigned char byte = ch;
      printer->PrintRaw("\\x");
      PrintHexDigit(byte >> 4, printer);
      PrintHexDigit(byte & 15, printer);
      max_cols -= 4;
    }
    str->remove_prefix(1);
  }
  printer->Print("\'");
}

这里 isprint(ch)ch 是一个 char 。而按照 https://en.cppreference.com/w/c/string/byte/isprint 的说法。这个函数接受的参数虽然是个 int 但它必须是个 unsigned char 或者 EOF 。否则是UB。实际上,这里对于utf-8字符串或者unicode,这里取到的 ch 可能为负数,于是隐式转换成 int 后不是一个 unsigned char 。在MSVC下此处会崩溃。最简单呃方法是把 isprint(ch) 改成 isprint(static_cast<int>(static_cast<unsigned char>(ch)))

cmake生成工具的问题

第二个问题是新版本的 upb 中移除了 CMakeLists.txt 文件。按我和社区讨论的说法是本来应该仓库里提供这个文件的,不知为何最新版本消失了。其实 upb 是提供了工具,通过Bazel工程生成 CMakeLists.txt 文件的,但是生成的 CMakeLists.txt 文件有两个问题。第一个问题是新增的依赖库 utf8_range 中,链接的target的名字带了 / ,因为它在 Bazel中的target名字是 //third_party/utf8_range:utf8_range 。第二个问题是它没有生成 export 配置。导致我们无法使用cmake自带的 find_package 功能自动设置依赖包含目录和自动按需链接库。而 upb 内部模块还比较多,管理起来还比较麻烦。

对于第二个问题, vcpkg 的做法是打了个巨大的patch。 https://github.com/microsoft/vcpkg/blob/master/ports/upb/0001-make-cmakelists-py.patch 。而我也一样打了个更大的Patch,解决了这两个问题,和前面提到的Lua 5.3+ 版本适配和未定义行为调用的问题。

grpc 支持

非常悲伤的是 grpc 也依赖 upb ,并且它是以源码subtree的方式引入的,grpc 自己写了编译脚本,仅仅引入了运行时。显然如果我们想同时使用 grpcupb 的Lua binding。我们就必须编译出 upbprotoc-gen-lua 插件,并且使用同一版本。这导致我不得不写两份Patch。一份在 grpc 里,一份在 upb 里。

这两份Patch都写在之前我写的构建系统 cmake-toolset 里了, 地址在 https://github.com/atframework/cmake-toolset/tree/main/ports/grpc ,有需要的小伙伴可以自取。

Lua binding和测试小工具

也是为了方便测试,我在 cmake-toolset 构建系统的Test里写了个小工具,可以加载 upb 的Lua binding然后直接命令行做测试。 只要clone cmake-toolset 之后,使用 bash ci/do_ci.sh gcc.standalone-upb.testpwsh -File ci/do_ci.ps1 msvc.standalone-upb.test 就可以生成。 然后通过 cmake-toolset-upb-and-lua-binding-test [Lua查找目录或要执行的Lua文件...] 来执行指定Lua脚本。

xresloader 读表工具集成

使用上面的测试小工具,我顺带做了对我们项目组使用的转表工具 xresloader 所生成的二进制文件的读表支持。 还是通过 xres-code-generator 来自动生成加载代码,和C++接口与纯Lua接口一样,支持多版本管理、多重索引( Key-Value 型和 Key-List 型 )和lazy load。

比如在以下proto中:

syntax = "proto3";

import "xresloader.proto";
import "xresloader_ue.proto";

import "xrescode_extensions_v3.proto";

message role_upgrade_cfg {
    option (org.xresloader.ue.helper)       = "helper";
    option (org.xresloader.msg_description) = "Test role_upgrade_cfg with multi keys";

    option (xrescode.loader) = {
        file_path : "role_upgrade_cfg.bytes"
        indexes : {
            // name: "id"
            fields : "Id"
            index_type : EN_INDEX_KL // Key - List index: (Id) => list<role_upgrade_cfg>
        }
        indexes : {
            // name: "id_level" // 默认索引名字为所有的field名转小写用下划线连接
            fields : "Id"
            fields : "Level"
            index_type : EN_INDEX_KV // Key - Value index: (Id, Level) => role_upgrade_cfg
        }
        tags : "client"
        tags : "server"
    };

    uint32 Id        = 1 [ (org.xresloader.ue.key_tag) = 1000 ];
    uint32 Level     = 2 [ (org.xresloader.ue.key_tag) = 1 ];
    uint32 CostType  = 3 [ (org.xresloader.verifier) = "cost_type", (org.xresloader.field_description) = "Refer to cost_type" ];
    int64  CostValue = 4;
    int32  ScoreAdd  = 5;
}

生成的代码使用方式如下:

-- We will use require(...) to load
--   - DataTableServiceUpb
--   - DataTableCustomIndexUpb
--   - xrescode_extensions_v3_pb
--   - pb_header_v3_pb
--   - upb
--   - google/protobuf/descriptor_pb
--   - Other custom proto files generated by protoc-gen-lua
-- Please ensure these can be load by require(FILE_PATH)
local excel_config_service = require("DataTableServiceUpb")
local upb = require("upb")

-- Set logger
-- excel_config_service:OnError = function (message, data_set, indexName, keys...) end
-- excel_config_service:OnInfo = function (message, data_set, indexName) end

excel_config_service:ReloadTables()

-- 获取索引的message名字
local role_upgrade_cfg = excel_config_service:Get("role_upgrade_cfg")
print("======================= Lazy load begin =======================")
-- 按(索引名, key...) 获取Excel中一行
local data = role_upgrade_cfg:GetByIndex("id_level", 10001, 3) -- using the Key-Value index: id_level
print("======================= Lazy load end =======================")

print("----------------------- Get by Key-Value index -----------------------")
-- upb自带序列化成json的功能
print(string.format("Data of role_upgrade_cfg: id=10001, level=3 -> json_encode: %s",
    upb.json_encode(data, { upb.JSONENC_PROTONAMES })))

print("----------------------- Get by reflection and Key-List index -----------------------")
-- 获取当前配置分组,也就是当前版本。当然也可以获取其他版本
local current_group = excel_config_service:GetCurrentGroup()
-- 获取message名字和索引的名字获取
local role_upgrade_cfg2 = excel_config_service:GetByGroup(current_group, "role_upgrade_cfg")
local data2 = role_upgrade_cfg2:GetByIndex("id", 10001) -- using the Key-List index: id
for _, v1 in ipairs(data2) do
    print(string.format("\tid: %s, level: %s", tostring(v1.Id), tostring(v1.Level)))
    -- 此处是反射接口测试
    for fds in role_upgrade_cfg2:GetMessageDescriptor():fields() do
        print(string.format("\t\t%s=%s", fds:name(), tostring(v1[fds:name()])))
    end
end

记录一个Python的crash

在我写一个Python工具的时候碰到一个和 upb 有关的Crash BUG。其实应该是 protobuf的 python实现的BUG,不是 upb 的BUG,也记录在这里吧。

Crash位置:

#0  0x00007fd147b64b20 in upb_Array_Size () from /home/owent/workspace/tencent/pix/server/third_party/install/linux-x86_64-gnu/.modules/lib/python3.9/site-packages/google/_upb/_message.abi3.so
#1  0x00007fd147b51a79 in PyUpb_ExtensionDict_Contains () from /home/owent/workspace/tencent/pix/server/third_party/install/linux-x86_64-gnu/.modules/lib/python3.9/site-packages/google/_upb/_message.abi3.so
#2  0x00007fd14f3621d0 in _PyEval_EvalFrameDefault (tstate=<optimized out>, f=0x7fd14753fa40, throwflag=<optimized out>) at Python/ceval.c:3036
#3  0x00007fd14f30df02 in _PyEval_EvalFrame (throwflag=0, f=0x7fd14753fa40, tstate=0x89e150) at ./Include/internal/pycore_ceval.h:40
#4  function_code_fastcall (tstate=0x89e150, co=<optimized out>, args=<optimized out>, nargs=4, globals=<optimized out>) at Objects/call.c:330
#5  0x00007fd14f3617d6 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=<optimized out>, args=0xca1e68, callable=0x7fd1473a8700, tstate=0x89e150) at ./Include/cpython/abstract.h:118
#6  PyObject_Vectorcall (kwnames=0x0, nargsf=<optimized out>, args=0xca1e68, callable=0x7fd1473a8700) at ./Include/cpython/abstract.h:127
#7  call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=0x89e150) at Python/ceval.c:5077
#8  _PyEval_EvalFrameDefault (tstate=<optimized out>, f=0xca1c80, throwflag=<optimized out>) at Python/ceval.c:3506
#9  0x00007fd14f3606cf in _PyEval_EvalFrame (throwflag=0, f=0xca1c80, tstate=0x89e150) at ./Include/internal/pycore_ceval.h:40
#10 _PyEval_EvalCode (tstate=tstate@entry=0x89e150, _co=<optimized out>, globals=<optimized out>, locals=locals@entry=0x0, args=<optimized out>, argcount=<optimized out>, kwnames=0x7fd147c1a058, kwargs=0x7fd147d564a8, kwcount=7, kwstep=1, 
    defs=0x7fd147b88e38, defcount=7, kwdefs=0x0, closure=0x0, name=0x7fd14f5ca530, qualname=0x7fd147c27e40) at Python/ceval.c:4329
#11 0x00007fd14f30dd69 in _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:396
#12 0x00007fd14f30d27d in _PyObject_FastCallDictTstate (tstate=0x89e150, callable=0x7fd1473a88b0, args=<optimized out>, nargsf=<optimized out>, kwargs=<optimized out>) at Objects/call.c:129
#13 0x00007fd14f30e68b in _PyObject_Call_Prepend (tstate=tstate@entry=0x89e150, callable=callable@entry=0x7fd1473a88b0, obj=obj@entry=0x7fd147d0a4c0, args=args@entry=0x7fd147da0b20, kwargs=kwargs@entry=0x7fd147cdbac0) at Objects/call.c:489
#14 0x00007fd14f33f839 in slot_tp_init (self=0x7fd147d0a4c0, args=0x7fd147da0b20, kwds=0x7fd147cdbac0) at Objects/typeobject.c:6969
#15 0x00007fd14f33cd4b in type_call (type=<optimized out>, args=0x7fd147da0b20, kwds=0x7fd147cdbac0) at Objects/typeobject.c:1026
#16 0x00007fd14f30d553 in _PyObject_MakeTpCall (tstate=0x89e150, callable=0x91ad90, args=0x95ab48, nargs=<optimized out>, keywords=<optimized out>) at Objects/call.c:191
#17 0x00007fd14f365e20 in _PyObject_VectorcallTstate (kwnames=0x7fd147e45820, nargsf=<optimized out>, args=0x95ab48, callable=0x91ad90, tstate=<optimized out>) at ./Include/cpython/abstract.h:116
#18 _PyObject_VectorcallTstate (kwnames=0x7fd147e45820, nargsf=<optimized out>, args=0x95ab48, callable=0x91ad90, tstate=<optimized out>) at ./Include/cpython/abstract.h:103
#19 PyObject_Vectorcall (kwnames=0x7fd147e45820, nargsf=<optimized out>, args=0x95ab48, callable=0x91ad90) at ./Include/cpython/abstract.h:127
#20 call_function (kwnames=0x7fd147e45820, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=<optimized out>) at Python/ceval.c:5077
#21 _PyEval_EvalFrameDefault (tstate=<optimized out>, f=0x95a910, throwflag=<optimized out>) at Python/ceval.c:3537
#22 0x00007fd14f3606cf in _PyEval_EvalFrame (throwflag=0, f=0x95a910, tstate=0x89e150) at ./Include/internal/pycore_ceval.h:40
#23 _PyEval_EvalCode (tstate=tstate@entry=0x89e150, _co=<optimized out>, globals=<optimized out>, locals=locals@entry=0x0, args=<optimized out>, argcount=<optimized out>, kwnames=0x0, kwargs=0x8f3190, kwcount=0, kwstep=1, defs=0x0, 
    defcount=0, kwdefs=0x0, closure=0x0, name=0x7fd147f565f0, qualname=0x7fd147f565f0) at Python/ceval.c:4329
#24 0x00007fd14f30dd69 in _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:396
#25 0x00007fd14f361550 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=9223372036854775808, args=0x8f3190, callable=0x7fd147ccfca0, tstate=0x89e150) at ./Include/cpython/abstract.h:118
#26 PyObject_Vectorcall (kwnames=0x0, nargsf=9223372036854775808, args=0x8f3190, callable=0x7fd147ccfca0) at ./Include/cpython/abstract.h:127
#27 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=0x89e150) at Python/ceval.c:5077
#28 _PyEval_EvalFrameDefault (tstate=<optimized out>, f=0x8f3020, throwflag=<optimized out>) at Python/ceval.c:3520
#29 0x00007fd14f3606cf in _PyEval_EvalFrame (throwflag=0, f=0x8f3020, tstate=0x89e150) at ./Include/internal/pycore_ceval.h:40
#30 _PyEval_EvalCode (tstate=0x89e150, _co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=0x0, kwargs=0x0, kwcount=0, kwstep=2, defs=0x0, defcount=0, kwdefs=0x0, 
    closure=0x0, name=0x0, qualname=0x0) at Python/ceval.c:4329
#31 0x00007fd14f360382 in _PyEval_EvalCodeWithName (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>, kwargs=0x0, kwcount=0, kwstep=2, defs=0x0, 
    defcount=0, kwdefs=0x0, closure=0x0, name=0x0, qualname=0x0) at Python/ceval.c:4361
#32 0x00007fd14f360323 in PyEval_EvalCodeEx (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kws=<optimized out>, kwcount=0, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0)
    at Python/ceval.c:4377
#33 0x00007fd14f3d20fb in PyEval_EvalCode (co=co@entry=0x7fd147db1500, globals=globals@entry=0x7fd147f50100, locals=locals@entry=0x7fd147f50100) at Python/ceval.c:828
#34 0x00007fd14f3e3123 in run_eval_code_obj (tstate=0x89e150, co=0x7fd147db1500, globals=0x7fd147f50100, locals=0x7fd147f50100) at Python/pythonrun.c:1221
#35 0x00007fd14f3e30aa in run_mod (mod=<optimized out>, filename=<optimized out>, globals=0x7fd147f50100, locals=0x7fd147f50100, flags=<optimized out>, arena=<optimized out>) at Python/pythonrun.c:1242
#36 0x00007fd14f2b77a1 in pyrun_file (fp=fp@entry=0x8d81f0, filename=filename@entry=0x7fd147d97b70, start=start@entry=257, globals=globals@entry=0x7fd147f50100, locals=locals@entry=0x7fd147f50100, closeit=closeit@entry=1, 
    flags=0x7ffe748d3e88) at Python/pythonrun.c:1140
#37 0x00007fd14f2b754d in pyrun_simple_file (flags=0x7ffe748d3e88, closeit=1, filename=0x7fd147d97b70, fp=0x8d81f0) at Python/pythonrun.c:450
#38 PyRun_SimpleFileExFlags (fp=fp@entry=0x8d81f0, filename=<optimized out>, closeit=closeit@entry=1, flags=flags@entry=0x7ffe748d3e88) at Python/pythonrun.c:483
#39 0x00007fd14f2b840d in PyRun_AnyFileExFlags (fp=fp@entry=0x8d81f0, filename=<optimized out>, closeit=closeit@entry=1, flags=flags@entry=0x7ffe748d3e88) at Python/pythonrun.c:92
#40 0x00007fd14f3eac02 in pymain_run_file (cf=0x7ffe748d3e88, config=0x89ac90) at Modules/main.c:373
#41 pymain_run_python (exitcode=0x7ffe748d3e80) at Modules/main.c:598
#42 Py_RunMain () at Modules/main.c:677
#43 0x00007fd14f3ea757 in Py_BytesMain (argc=<optimized out>, argv=<optimized out>) at Modules/main.c:731
#44 0x00007fd14e4fe555 in __libc_start_main () from /lib64/libc.so.6
#45 0x000000000040108e in _start ()

相关结构定义:


size_t upb_Array_Size(const upb_Array* arr) { return arr->size; }

/* Our internal representation for repeated fields.  */
struct upb_Array {
  uintptr_t data;  /* Tagged ptr: low 3 bits of ptr are lg2(elem size). */
  size_t size;     /* The number of elements in the array. */
  size_t capacity; /* Allocated storage. Measured in elements. */
  uint64_t junk;
};

System V X86_64 调用约定 中参数传入的寄存器顺序为: rdi, rsi, rdx, rcx, r8, r9

(gdb) info all-registers
rax            0x0                 0
rbx            0x7fd1473bb2f0      140536819987184
rcx            0x0                 0
rdx            0x7fd1477d77b0      140536824297392
rsi            0x9b8af0            10193648
rdi            0x0                 0
rbp            0x7fd14753fa40      0x7fd14753fa40
rsp            0x7ffe748d3288      0x7ffe748d3288
r8             0x243f6a8885a308d3  2611923443488327891
r9             0x3f                63
r10            0xfcabbfeccfd89b2a  -239887131313923286
r11            0x13198a2e03707344  1376283091369227076
r12            0x7fd147cdc06e      140536829558894
r13            0x7fd14753fbf0      140536821578736
r14            0xd03df0            13647344
r15            0xa40328            10748712
rip            0x7fd147b64b20      0x7fd147b64b20 <upb_Array_Size>

以上,PyUpb_ExtensionDict_Containsupb_Array_Size 的调用,参数传入了 nullptr

这个BUG已经提交到 https://github.com/protocolbuffers/protobuf/issues/10396 ,有碰到同样问题的小伙伴也可以关注下进展。


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK