RHEL / Rocky Linux 9: shipped sqlite version breaks system tools like dnf #8

dfied opened this issue Dec 25, 2024 · 6 comments

dfied commented Dec 25, 2024

Hi,

The newly shipped sqlite package seems to break dnf, which ends in a segfault.
Other local system tools may be affected, too. I also have trouble accessing my server via ssh; it uses fail2ban (which afaik uses sqlite, too).
Please roll back that updated package. I guess updating the system-wide sqlite libs is problematic.
A solution could be sqlite libs that are exclusively for grommunio services.

bye
Daniel

@dfied
Author

dfied commented Dec 26, 2024

Just an update: I've downgraded sqlite to the distribution-shipped version, and dnf works again as expected. I also can't see any errors in gromox. I guess the packaged sqlite version is preparation for reintroducing the CONCAT function in SQL statements, see grommunio/gromox#114.
I also blocked updates of sqlite-* from the grommunio repo by editing the repo file (excludepkgs=sqlite-*).

@jengelh jengelh self-assigned this Dec 26, 2024
@jengelh
Member

jengelh commented Dec 26, 2024

  • I can reproduce it
  • it does not look like sqlite is at fault, but I am still busy in gdb
  • sqlite3 is purged from download.grommunio.com/*/EL9 for the time being
  • there are no plans to use CONCAT() in gromox; sqlite was needed for g-index

@robert-scheck
Contributor

Being one of the very first EPEL package maintainers ever (and also a package maintainer and/or contributor for some other 3rd-party Fedora and/or EL repositories), I would like to stress that it is a terrible practice to replace a package from the base/core operating system, such as sqlite.

First of all, your package might be built with different compiler versions, tools and flags, which may or may not have an impact at run-time (given you are using Open Build Service instead of mock, it's IMHO very different). Furthermore, the CentOS/RHEL/Rocky Linux 9 sqlite package might contain essential patches that your package lacks, leading to different behaviour. Next, newer versions of SQLite might contain new features (usually desired, but they may still include breaking changes), but also new bugs and security flaws. Once you replace a distribution package, you then need to provide updated sqlite packages until end of life; usually a burden that a 3rd-party repository doesn't want to or simply can't carry long-term (given the lifecycle of gromox/grommunio versions is most likely shorter than the lifecycle of RHEL/Rocky Linux 9, there is already a conflict of interest). A real-world example of the various issues caused by replacing base/core operating system packages was ATrpms (sorry, Axel!) for Fedora.

If you really rely on CONCAT() in sqlite package for grommunio, please go the correct and professional way: You are an Independent Software Vendor (ISV), so, if you haven't done already, please register as a Red Hat ISV partner, file a ticket in the Red Hat customer portal for RHEL 9 and describe your needs. Red Hat will then, once your request has been reviewed and approved by engineering, backport the required CONCAT() functionality to the sqlite package in RHEL 9.6 or later (providing a patch that backports the desired feature to the sqlite package in the base/core operating system might help, but isn't strictly necessary). Note that Red Hat most likely won't rebase the sqlite package on a newer version, but only perform backports of the specific features.

Alternatively, provide a non-replacing sqlite package like gromox-sqlite with the desired SQLite libraries and tools on a non-default path (nothing must pick up these files by default) and make gromox/grommunio use those explicitly. But that still leaves the burden of further maintenance for future security issues.

@jengelh
Member

jengelh commented Dec 30, 2024

dnf/librpm in EL9 is oddly broken. No matter which sqlite version you have, after running "dnf", a WAL file is left behind at /var/lib/dnf/history.sqlite-wal. That really ought not to happen; dnf/librpm probably forgets to call sqlite3_close/shutdown or something. Anyway, this causes sqlite to do WAL recovery on the next run. While doing that, sqlite logs messages, or at least it tries to:

gdb /usr/bin/python3
(gdb) b sqlite3_log
(gdb) r /usr/bin/dnf update

Breakpoint 1, sqlite3_log (iErrCode=21, zFormat=0x7ffff63020ac "%s at line %d of [%.10s]")
    at /usr/src/debug/sqlite-3.34.1-7.el9_3.x86_64/sqlite3.c:29489
(gdb) c
Breakpoint 1, sqlite3_log (iErrCode=21, zFormat=0x7ffff63020ac "%s at line %d of [%.10s]")
    at /usr/src/debug/sqlite-3.34.1-7.el9_3.x86_64/sqlite3.c:29489
(gdb) 
Continuing.
Last metadata expiration check: 1:19:22 ago on Mon 30 Dec 2024 00:34:44 CET.

Breakpoint 1, sqlite3_log (iErrCode=283, zFormat=0x7ffff6302c08 "recovered %d frames from WAL file %s")
    at /usr/src/debug/sqlite-3.34.1-7.el9_3.x86_64/sqlite3.c:29489
29489   SQLITE_API void sqlite3_log(int iErrCode, const char *zFormat, ...){
29491     if( sqlite3GlobalConfig.xLog ){
29492       va_start(ap, zFormat);
29493       renderLogMsg(iErrCode, zFormat, ap);

The xLog pointer is NULL, so nothing is ever logged, and therefore the crash does not trigger. xLog is NULL because of a blunder in rpm 4.16, where rpm calls sqlite3_config (line 168) after sqlite3_open_v2 (line 145):

#0  sqlite_init (dbhome=<optimized out>, rdb=0x5555559f9500) at /usr/src/debug/rpm-4.16.1.3-34.el9.0.1.x86_64/lib/backend/sqlite.c:147
144             while (retry_open--) {
145                 xx = sqlite3_open_v2(dbfile, &sdb, flags, NULL);
146                 /* Attempt to create if missing, discarding OPEN_READONLY (!) */
147                 if (xx == SQLITE_CANTOPEN && (flags & SQLITE_OPEN_READONLY)) {
148                     /* Sqlite allocates resources even on failure to open (!) */
149                     sqlite3_close(sdb);
150                     flags &= ~SQLITE_OPEN_READONLY;
151                     flags |= (SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE);
152                     retry_open++;
153                 }
154             }
155
156             if (xx != SQLITE_OK) {
157                 rpmlog(RPMLOG_ERR, _("Unable to open sqlite database %s: %s\n"),
158                         dbfile, sqlite3_errstr(xx));
159                 rc = 1;
160                 goto exit;
161             }
162
163             sqlite3_create_function(sdb, "match", 3,
164                                     (SQLITE_UTF8|SQLITE_DETERMINISTIC),
165                                     NULL, rpm_match3, NULL, NULL);
166
167             sqlite3_busy_timeout(sdb, sleep_ms);
168             sqlite3_config(SQLITE_CONFIG_LOG, errCb, rdb);
169
170             sqlexec(sdb, "PRAGMA secure_delete = OFF");
171             sqlexec(sdb, "PRAGMA case_sensitive_like = ON");

and that's not allowed by sqlite 3.34:

162243    /* sqlite3_config() shall return SQLITE_MISUSE if it is invoked while
162244    ** the SQLite library is in use. */
162245    if( sqlite3GlobalConfig.isInit ) return SQLITE_MISUSE_BKPT;

sqlite in fact tried to log the occurrence of SQLITE_MISUSE_BKPT, which is the "%s at line %d of [%.10s]" log attempt from earlier.
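
To make the mechanism above concrete, here is a minimal toy model in C of a library that refuses global configuration once it has been initialized. It is only a sketch under stated assumptions: all names (toy_open, toy_config_log, toy_log, TOY_MISUSE) are hypothetical stand-ins and not SQLite's actual internals; only the value 21 is taken from the iErrCode seen in the gdb session.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return codes; 21 mirrors the iErrCode=21 seen in gdb. */
enum { TOY_OK = 0, TOY_MISUSE = 21 };

typedef void (*toy_log_fn)(void *arg, int code, const char *msg);

/* Library-wide configuration, zero-initialized like any static object. */
static struct {
    int is_init;
    toy_log_fn log_fn;
    void *log_arg;
} g_cfg;

/* Return the toy library to its pristine, unconfigured state. */
void toy_reset(void)
{
    g_cfg.is_init = 0;
    g_cfg.log_fn = NULL;
    g_cfg.log_arg = NULL;
}

/* Opening a handle initializes the library as a side effect. */
void toy_open(void)
{
    g_cfg.is_init = 1;
}

/* Strict policy: any global configuration after init is rejected. */
int toy_config_log(toy_log_fn fn, void *arg)
{
    if (g_cfg.is_init)
        return TOY_MISUSE;
    g_cfg.log_fn = fn;
    g_cfg.log_arg = arg;
    return TOY_OK;
}

/* Logging is a silent no-op while no callback is installed. */
void toy_log(int code, const char *msg)
{
    if (g_cfg.log_fn == NULL)
        return;
    g_cfg.log_fn(g_cfg.log_arg, code, msg);
}

/* Example callback: counts delivered messages via the user argument. */
void toy_count_cb(void *arg, int code, const char *msg)
{
    int *counter = arg;
    assert(counter != NULL);
    (void)code;
    (void)msg;
    (*counter)++;
}
```

Calling toy_open() before toy_config_log() reproduces the pattern described above: the config call returns TOY_MISUSE, the callback is never installed, and later toy_log() calls go nowhere. Reversing the order installs the callback and messages are delivered.
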
Now what's different in sqlite 3.47:

425       /* sqlite3_config() normally returns SQLITE_MISUSE if it is invoked while
426       ** the SQLite library is in use.  Except, a few selected opcodes
427       ** are allowed.
428       */

Thus, logging does happen, and then you notice the third bug of the day, a use-after-free.

LD_LIBRARY_PATH=sqlite-3.47/.libs gdb /usr/bin/python3
(gdb) r /usr/bin/dnf update
(gdb) b renderLogMsg
(gdb) b sqlite3_config
...
Breakpoint 2, sqlite3_config (op=op@entry=16) at /root/sqlite-src-3470200/src/main.c:429
429       if( sqlite3GlobalConfig.isInit ){
(gdb) up
#1  0x00007ffff63d4847 in sqlite_init (dbhome=<optimized out>, rdb=0x5555559f9500) at backend/sqlite.c:168
168             sqlite3_config(SQLITE_CONFIG_LOG, errCb, rdb);
(gdb) p rdb
$89 = (rpmdb) 0x5555559f9500
(gdb) p rdb[0]
$90 = {db_root = 0x5555559f9af0 "/", db_home = 0x5555559f9ad0 "/var/lib/rpm", db_fullpath = 0x5555559f9b50 "/var/lib/rpm", 
  db_flags = 0, db_mode = 0, db_perms = 420, db_descr = 0x7ffff63d73f1 "sqlite", db_checked = 0x0, db_next = 0x0, db_opens = 0, 
  db_pkgs = 0x0, db_tags = 0x7ffff63d9b20 <dbiTags.2>, db_ndbi = 18, db_indexes = 0x5555559f7cc0, db_buildindex = 0, 
  db_ops = 0x7ffff63f3160 <sqlite_dbops>, db_dbenv = 0x0, db_cache = 0x0, cfg = {db_mmapsize = 0, db_cachesize = 0, db_verbose = 0, 
    db_no_fsync = 0, db_eflags = 0}, db_remove_env = 0, db_getops = {begin = {u = {tv = {tv_sec = 0, tv_usec = 0}, ticks = 0, tocks = {
          0, 0}}}, count = 0, bytes = 0, usecs = 0}, db_putops = {begin = {u = {tv = {tv_sec = 0, tv_usec = 0}, ticks = 0, tocks = {0, 
          0}}}, count = 0, bytes = 0, usecs = 0}, db_delops = {begin = {u = {tv = {tv_sec = 0, tv_usec = 0}, ticks = 0, tocks = {0, 
          0}}}, count = 0, bytes = 0, usecs = 0}, nrefs = 1}
(gdb) c
Continuing.
Repository 'gr' is missing name in configuration, using id.

Breakpoint 2, sqlite3_config (op=op@entry=16) at /root/sqlite-src-3470200/src/main.c:429
429       if( sqlite3GlobalConfig.isInit ){
(gdb) up
#1  0x00007ffff63d4847 in sqlite_init (dbhome=<optimized out>, rdb=0x555555b87360) at backend/sqlite.c:168
168             sqlite3_config(SQLITE_CONFIG_LOG, errCb, rdb);
(gdb) p rdb
$91 = (rpmdb) 0x555555b87360
(gdb) p rdb[0]
$92 = {db_root = 0x555555b87490 "/", db_home = 0x555555b87340 "/var/lib/rpm", db_fullpath = 0x555555b874f0 "/var/lib/rpm", 
  db_flags = 0, db_mode = 0, db_perms = 420, db_descr = 0x7ffff63d73f1 "sqlite", db_checked = 0x0, db_next = 0x0, db_opens = 0, 
  db_pkgs = 0x0, db_tags = 0x7ffff63d9b20 <dbiTags.2>, db_ndbi = 18, db_indexes = 0x555555b85170, db_buildindex = 0, 
  db_ops = 0x7ffff63f3160 <sqlite_dbops>, db_dbenv = 0x0, db_cache = 0x0, cfg = {db_mmapsize = 0, db_cachesize = 0, db_verbose = 0, 
    db_no_fsync = 0, db_eflags = 0}, db_remove_env = 0, db_getops = {begin = {u = {tv = {tv_sec = 0, tv_usec = 0}, ticks = 0, tocks = {
          0, 0}}}, count = 0, bytes = 0, usecs = 0}, db_putops = {begin = {u = {tv = {tv_sec = 0, tv_usec = 0}, ticks = 0, tocks = {0, 
          0}}}, count = 0, bytes = 0, usecs = 0}, db_delops = {begin = {u = {tv = {tv_sec = 0, tv_usec = 0}, ticks = 0, tocks = {0, 
          0}}}, count = 0, bytes = 0, usecs = 0}, nrefs = 1}

(gdb) c
Continuing.
Last metadata expiration check: 1:00:14 ago on Mon 30 Dec 2024 00:34:44 CET.

Breakpoint 1, renderLogMsg (iErrCode=283, zFormat=0x7ffff6305a00 "recovered %d frames from WAL file %s", ap=ap@entry=0x7fffffffb4b8)
    at /root/sqlite-src-3470200/src/printf.c:1325
1325      sqlite3GlobalConfig.xLog(sqlite3GlobalConfig.pLogArg, iErrCode,
(gdb) p sqlite3Config.pLogArg
$93 = (void *) 0x555555b87360
(gdb) p *(rpmdb)sqlite3Config.pLogArg
$94 = {db_root = 0x3638782f534f6573 <error: Cannot access memory at address 0x3638782f534f6573>, 
  db_home = 0x2f736f2f34365f <error: Cannot access memory at address 0x2f736f2f34365f>, db_fullpath = 0x0, db_flags = 81, db_mode = 0, 
  db_perms = 1886680168, db_descr = 0x2e73726f7272696d <error: Cannot access memory at address 0x2e73726f7272696d>, 
  db_checked = 0x6e72756f626c656d, db_next = 0x2f6b752e6f632e65, db_opens = 1801678706, db_pkgs = 0x534f657361422f35, 
  db_tags = 0x2f34365f3638782f, db_ndbi = 3109743, db_indexes = 0x0, db_buildindex = 81, db_ops = 0x2f2f3a7370747468, 
  db_dbenv = 0x696c2d796b636f72, db_cache = 0x61796f6b2e78756e, cfg = {db_mmapsize = 779380078, db_cachesize = 1915713132, 
    db_verbose = 2037080943, db_no_fsync = 1852402733, db_eflags = 959412341}, db_remove_env = 1110390062, db_getops = {begin = {u = {
        tv = {tv_sec = 4069054298190082913, tv_usec = 3419198790634463030}, ticks = 4069054298190082913, tocks = {4069054298190082913, 
          3419198790634463030}}}, count = 0, bytes = 65, usecs = 3400000511170344040}, db_putops = {begin = {u = {tv = {
          tv_sec = 7022288818005045106, tv_usec = 8083505465032338291}, ticks = 7022288818005045106, tocks = {7022288818005045106, 
          8083505465032338291}}}, count = 775499628, bytes = 3907004821400413555, usecs = 13356245275915871}, db_delops = {begin = {
      u = {tv = {tv_sec = 0, tv_usec = 65}, ticks = 0, tocks = {0, 65}}}, count = 1886680168, bytes = 8030606882683909225, 
    usecs = 8460965629232313202}, nrefs = 795176238}
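
The dump above shows a library-wide log hook still holding a pointer to a per-database object whose memory has since been reused. The following toy model in C sketches one possible way to avoid that pattern: the close routine drops the borrowed pointer before freeing. All names here (handle_open, handle_close, emit_log, g_hook) are hypothetical, and this is not claimed to be how rpm addressed it.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Toy stand-in for a library-wide log hook (not SQLite's real internals). */
struct log_hook {
    void (*fn)(void *arg, int code, const char *msg);
    void *arg;
};
static struct log_hook g_hook;

/* Toy stand-in for a per-database handle such as rpmdb. */
struct handle {
    char home[32];
    int messages_seen;
};

/* Log callback: dereferences the handle it was registered with. */
static void on_log(void *arg, int code, const char *msg)
{
    struct handle *h = arg;
    assert(h != NULL);
    (void)code;
    (void)msg;
    h->messages_seen++;
}

/* Allocate a handle and register it as the global hook's argument. */
struct handle *handle_open(const char *home)
{
    struct handle *h = calloc(1, sizeof(*h));
    if (h == NULL)
        return NULL;
    snprintf(h->home, sizeof(h->home), "%s", home);
    g_hook.fn = on_log;   /* the global hook now borrows a pointer to h */
    g_hook.arg = h;
    return h;
}

/* Free a handle, first unregistering it so no stale pointer remains. */
void handle_close(struct handle *h)
{
    if (h == NULL)
        return;
    if (g_hook.arg == h) {
        g_hook.fn = NULL;
        g_hook.arg = NULL;
    }
    free(h);
}

/* Deliver a message; returns 1 if a hook received it, 0 otherwise. */
int emit_log(int code, const char *msg)
{
    if (g_hook.fn == NULL)
        return 0;
    g_hook.fn(g_hook.arg, code, msg);
    return 1;
}
```

Without the unregistration step in handle_close(), a later emit_log() would hand on_log() a pointer to freed memory, which is the situation visible in the gdb output. Another option would be to register the hook once with no per-handle argument at all.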

Collectively, that is fixed in rpm 4.20/4.19/4.18.2.

@dfied
Author

dfied commented Dec 30, 2024

@jengelh Great work!
But I'd like to stress the comment of @robert-scheck above. I'd really like to see that gromox ships a version of sqlite for gromox (or just g-index) and gromox leaves the system-wide package untouched.

Furthermore the rpm version of EL9 might be fixed (currently 4.16.1.3 in my case). So we need to wait for Red Hat to backport the fixes from 4.20/4.19/4.18.2.

@robert-scheck
Contributor

No matter how broken dnf and/or librpm in RHEL 9 is…it still worked until sqlite got updated by the grommunio repository.

> Furthermore the rpm version of EL9 might be fixed (currently 4.16.1.3 in my case). So we need to wait for Redhat to backport the fixes from 4.20/4.19/4.18.2.

Upstream rpm.org treats the 4.16 release series as no longer supported, so I wouldn't hold my breath for a 4.16 bugfix release. I also wouldn't expect a rebase to a version later than 4.16, given it hasn't happened in the last 2.5+ years (and it's not typical of how Red Hat acts).

If there is a bug that should be fixed in RHEL 9, a support ticket needs to be raised; otherwise a fix, especially for an issue that is not really common, is quite unlikely. Qualified customer- or partner-raised issues filed via the Red Hat customer portal are the most likely to be addressed. Oh, and "fix dnf and/or librpm so that we can replace sqlite with another version" won't be a valid reason ;-)

Nevertheless, it's still a terrible practice to replace a package from the base/core operating system. Even more importantly, this practice means that Red Hat won't provide any support for such a RHEL 9 system anymore. So you simply lose the value of your RHEL subscription (yes, users of unsupported RHEL clones might not care about this detail).

> I'd really like to see that gromox ships a version of sqlite for gromox (or just g-index) and gromox leaves the system-wide package untouched.

If there is really a need for a newer version of SQLite in RHEL 9, please either ask Red Hat for a backport (grommunio is an ISV and should be in the position to do so) or bundle your desired SQLite version in grommunio/gromox/g-index, but please leave the system-provided SQLite untouched.

@jengelh jengelh transferred this issue from grommunio/gromox Dec 31, 2024