Uses UTF8MB4 everywhere #8425
Signed-off-by: Jon Stovell <jonstovell@gmail.com>
Looks like 90% of this is just removing and hardcoding UTF-8 on everything. The InnoDB conversion is fairly safe. It may just take longer on certain tables for larger forums, and beyond the timeout protections nothing can be done to avoid that. It looks good from what I see.
Signed-off-by: Jon Stovell <jonstovell@gmail.com>
I'll run some test upgrades. Merge conflicts need to be resolved first, though. Or is there a prerequisite PR?
Signed-off-by: Jon Stovell <jonstovell@gmail.com> # Conflicts: # Sources/Db/APIs/MySQL.php
Great! Thank you. 🙂 Merge conflict has been resolved.
Nope.
First test was an upgrade of a new, vanilla 2.1.4 forum to 3.0, via CLI. I followed the old 2.1.x protocol, where I would copy the upgrade files over from the /other folder, then run upgrade.php. DB: MySQL, version 8.4.0. Had a few errors; here is the complete output:
Okay, so it looks like we have some unrelated upgrader bugs to fix before you can even get to the point of testing the new ConvertUtf8() logic in this PR. Oh, the joys of the upgrader never cease. 😒
Looks like 2 things here...
Probably the same issues in a different form... But I attempted an install in the same environment. This comes up after entering the DB credentials, etc. Same issues in PHP 8.3 & 8.4:
Same errors occur in Unix.
Signed-off-by: Jon Stovell <jonstovell@gmail.com>
I don't think that's an issue. Since database transactions are always atomic, changing the whole table at once whenever possible is actually better and safer. When we do one column at a time using the method where we change the column to a binary encoding and then back, an interruption at an inopportune moment can leave the column sitting there in a binary encoding. If anything, we should probably build more safety checks around the column-based method in the new upgrader code.
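To make the two approaches being compared here concrete, the following is a minimal Python sketch that only builds the SQL statements involved; it is not the upgrader's actual code. The function names, the example table and column names, and the `actual_charset` parameter are all hypothetical, and the column types are simplified to BLOB and TEXT.

```python
def quote_ident(name: str) -> str:
    """Quote a MySQL identifier with backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def whole_table_conversion(table: str) -> list:
    """Whole-table approach: a single ALTER TABLE converts every text column."""
    return [f"ALTER TABLE {quote_ident(table)} CONVERT TO CHARACTER SET utf8mb4"]


def column_roundtrip_conversion(table: str, column: str, actual_charset: str) -> list:
    """Column approach: bounce one column off a binary type.

    Assumption: actual_charset is a trusted MySQL charset name describing the
    bytes really stored in the column, which may differ from the declared one.
    """
    t, c = quote_ident(table), quote_ident(column)
    return [
        # Step 1: switch to a binary type so MySQL stops interpreting the bytes.
        f"ALTER TABLE {t} MODIFY {c} BLOB",
        # Step 2: relabel the raw bytes with the charset they are really in.
        f"ALTER TABLE {t} MODIFY {c} TEXT CHARACTER SET {actual_charset}",
        # Step 3: let MySQL convert the correctly labelled text to utf8mb4.
        f"ALTER TABLE {t} MODIFY {c} TEXT CHARACTER SET utf8mb4",
    ]


if __name__ == "__main__":
    for sql in column_roundtrip_conversion("smf_messages", "body", "cp1251"):
        print(sql)
```

The interruption risk is visible in the sketch: if the process stops between the first and second statements of the column approach, the column is left as a BLOB, whereas the whole-table approach is a single statement.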
I don't think that's accurate. The old logic was inherited from SMF 2.0, when the mbstring extension was not required by SMF and therefore the upgrader couldn't rely on it. Regarding double encoding, all that I found on the matter was a single StackOverflow discussion in which the original poster was trying to do something different than we are (and didn't seem to understand what they were doing very well). Perhaps your searches turned up something mine didn't, though, so if there's something more, please share the link. I am often wrong, after all, and always glad to discover a better understanding. 🙂 |
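For readers unfamiliar with the term, here is a small Python illustration of what "double encoding" means in general. It is not taken from the upgrader or from the linked discussion; it only shows what happens when text that is already UTF-8 is treated as Latin-1 and converted to UTF-8 a second time, which is the kind of result a conversion function gives when it is told the wrong source encoding.

```python
original = "é"

# Correct UTF-8 encoding of the character: two bytes.
utf8_bytes = original.encode("utf-8")            # b'\xc3\xa9'

# Mistake: the UTF-8 bytes are read as if they were Latin-1...
misread = utf8_bytes.decode("latin-1")           # 'Ã©'

# ...and then converted to UTF-8 again, giving four bytes instead of two.
double_encoded = misread.encode("utf-8")         # b'\xc3\x83\xc2\xa9'


def undo_double_encoding(data: bytes) -> bytes:
    """Reverse one layer of Latin-1-as-UTF-8 double encoding.

    Only valid when the input really was double encoded this way.
    """
    return data.decode("utf-8").encode("latin-1")


if __name__ == "__main__":
    print(double_encoded)
    print(undo_double_encoding(double_encoded) == utf8_bytes)
```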
Signed-off-by: Jon Stovell <jonstovell@gmail.com>
I was just looking over the code again and noticed the spot that you were probably referring to; 14ea6ae should fix it. Again, ready for testing whenever you are. 🙂
Back in town & will get back to testing. Juggling a lot, so it may take a while. A couple notes on the above... First, this logic does go column by column when bouncing off of binary... See lines 3758+. Also, mb_convert_encoding pretty much only does what you tell it to, and sometimes that's a bad thing... The whole point of bouncing off of binary is to make use of MySQL's charset detection, which is far better than PHP's. The real issue is a lot of legacy latin1, win1251, etc., with various other encodings stuffed into it, from back in the day when PHP & MySQL were kinda awful at it... Run this: Food for thought.
I've made a change in order to let MySQL handle character set detection internally whenever and wherever possible. We now only do it manually for character sets that MySQL does not have native support for. I believe this will address your concerns, @sbulen.
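As a rough illustration of that decision, and not the code in this PR, the sketch below picks between letting MySQL convert a character set and converting manually. The names `SAMPLE_MYSQL_CHARSETS`, `choose_strategy`, and `manual_convert` are hypothetical, and the sample set is only a placeholder; a real check would presumably ask the server which character sets it supports, for example with `SHOW CHARACTER SET`.

```python
# Illustrative sample only; a real check would query the server
# (for example via SHOW CHARACTER SET) instead of hardcoding names.
SAMPLE_MYSQL_CHARSETS = {"latin1", "latin2", "cp1251", "big5", "utf8mb4"}


def choose_strategy(charset: str, mysql_charsets=SAMPLE_MYSQL_CHARSETS) -> str:
    """Return "mysql" when the server can convert the charset itself,
    otherwise "manual" to signal that the application must convert."""
    return "mysql" if charset.lower() in mysql_charsets else "manual"


def manual_convert(raw: bytes, python_codec: str) -> bytes:
    """Manual fallback: decode the stored bytes with a known codec and
    re-encode them as UTF-8. python_codec is a Python codec name, which
    is not necessarily the same as the MySQL charset name."""
    return raw.decode(python_codec).encode("utf-8")


if __name__ == "__main__":
    print(choose_strategy("CP1251"))
    print(choose_strategy("some-unsupported-charset"))
    print(manual_convert(b"\xe9", "latin-1"))
```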
615425e to ff8d696
Signed-off-by: Jon Stovell <jonstovell@gmail.com>
ff8d696 to 897e76c
Signed-off-by: Jon Stovell <jonstovell@gmail.com>
@sbulen, have you had a chance to test the latest changes yet?
OK, I snuck in a few more tests (YaBB SE, 1.0 & 1.1). This is all I can do for today:
3.0 install: OK 🍻
1.1 issue:
1.0 issue, very similar to 1.1:
YaBB SE issue - didn't get very far at all... (YaBB SE is so different, I wonder if it's time to punt on this support...):
Uh... The 1.0 & 1.1 default for session is... uh... odd: In Installer:
All tests run in PHP 8.4.2 & MySQL 8.4.0.
We can look into the issue with YaBB SE, but definitely not something we need to spend a lot of time on. It's been 21 years since YaBB SE 1.5.5 was released and a little over 20 since SMF 1.0 was released (YaBB SE 1.5.5 was released in January of 2004, SMF 1.0 final was released in late December of 2004). If someone is still running YaBB SE at this point, it's not likely they're ever going to upgrade.
For this PR we only care about 2.1 → 3.0 and 3.0 → 3.0, and specifically about the conversion to utf8mb4 for MySQL. There's no need to run tests on anything else at this point. Regarding the 3.0 → 3.0, what went wrong?
I would hope we plan on doing the same thing? These upgrades used to work, & I think this is a clue the changes & new approach aren't working. I suspect returning to the single command would work, rather than spreading it out over several steps. Haven't tested that - I've been pretty busy lately with multiple RL challenges.
CLI just returns & does nothing.
Signed-off-by: Jon Stovell <jonstovell@gmail.com>
Well, I can't reproduce that. 3.0 → 3.0 completes successfully for me, including the ConvertToUtf8() step, which is the only step we care about in this PR. I don't know what is causing 3.0 → 3.0 to fail for you right now, but since (a) it is happening at the first step and (b) you have run into the same problem previously, I don't think it is related to the changes in this PR.
Although 1.0 and 1.1 issues are out of scope for this PR, I've added a fix for that in 9f3409d.
Does it fail during ConvertToUtf8(), or somewhere else?
That'll be a problem to deal with in #8093
The 2.0 Latin 1 failure was highlighted up here - at first glance, it appears to be having issues with BLOBs? And yes, that's in the utf8mb4 conversion step. I have not attempted a retest recently.
Note my 3.0 => 3.0 failure is due to $upcontext['language'] being blank on this line:
Line 1050 in 783d8eb
My steps:
I haven't looked too closely, not a lot of time today... My initial suspicion is that there is some confusion between 'lang' and 'language', as both are used at different points around there. Or maybe the Config language isn't set anywhere? Note also that I don't think the throw error logic is working there, as the thrown error is not visible anywhere. Just a 'try again' link via browser, or a clean exit via CLI.
Thanks, I guess I overlooked that. 🙂 Unfortunately, I cannot reproduce that either. However, my test data for the 2.0 → 3.0 upgrade is just content generated by Populate.php, so perhaps that generated content isn't creating the right conditions to trigger the problem. Could we arrange a way for me to get copies of the databases you are using, @sbulen? I would like to run tests using the same data as you are using so that I can try to figure out what the cause is. In the meantime, though, I think I am going to go ahead and merge this for now. Even if there are still kinks to work out with the upgrader, waiting for the rest of the pending changes in this PR is holding up everything else in the development pipeline. Once I can get ahold of a copy of your test data and figure out the cause of the issues you are seeing, we will be able to fix the upgrade problems in the dedicated PR for upgrader changes.
OK. I'll attempt to reproduce these two issues with current GH. If I can reproduce, I'll write them up. Note that for both issues, I started with a simple fresh install. No other content.
Hm. So, just to make sure that I understand correctly, the database you were using for the 2.0 → 3.0 upgrade was a fresh, empty install of 2.0 that you immediately upgraded to 3.0? If I am indeed understanding correctly, then the fact that I cannot reproduce the problem is weird.
Yep. latin1 db.




Fixes #7938
Fixes #7173
Closes #6409
Closes #6406