It's because `unicode` was really broken, and a lot of the obvious breakage was when people mixed the two. So they did fix some of the obvious breakage, but they left a lot of the subtle breakage (in addition to breaking a lot of existing correct code, and introducing a completely nonsensical `bytes` class).
o11c
I've only ever seen two parts of git that could arguably be called unintuitive, and they both got fixes:
- `git reset` seems to do 2 unrelated things for some people. Nowadays `git restore` exists.
- the inconsistent difference between `a..b` and `a...b` commit ranges in various commands. This is admittedly obscure enough that I would have to look up the manual half the time anyway.
- I suppose we could also call it unintuitive that `man git foo` didn't use to work.
The tooling to integrate `git submodule` into normal tree operations could be improved though. But nowadays there's `git subtree` for all the people who want to do it wrong but easily.
The only reason people complain so much about git is that it's the only VCS that's actually widely used anymore. All the others have worse problems, but there's nobody left to complain about them.
Python 2 had one mostly-working `str` class, and a mostly-broken `unicode` class.
Python 3, for some reason, got rid of the one that mostly worked, leaving no replacement. The closest you can get is to spam `surrogateescape` everywhere, which is both incorrect and has a significant performance cost, and that still leaves several APIs unavailable.
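For anyone who hasn't run into it, here is a small sketch of what the `surrogateescape` workaround looks like in practice. The byte string is a made-up example (a Latin-1 filename), not something from a real system.

```python
# A filename stored as Latin-1 bytes; this is not valid UTF-8.
raw = b"caf\xe9.txt"

# Strict decoding refuses the data outright.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# surrogateescape smuggles the bad byte through as a lone surrogate (0xE9 -> U+DCE9).
text = raw.decode("utf-8", errors="surrogateescape")

# The original bytes come back only if every encode call also opts in.
roundtrip = text.encode("utf-8", errors="surrogateescape")

# Any API that encodes strictly (the default) chokes on the resulting str.
try:
    text.encode("utf-8")
    strict_encode_ok = True
except UnicodeEncodeError:
    strict_encode_ok = False
```

The resulting `str` is not valid Unicode text, which is why the error handler has to be repeated at every boundary.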
Simply removing `str` indexing would've fixed the common user mistake if that was really desirable. It's not like `unicode` indexing is meaningful either, and now large amounts of historical data can no longer be accessed from Python.
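To illustrate the indexing point, a quick sketch: indexing by code point does not give you "characters" any more than indexing by byte does. The strings are arbitrary examples.

```python
import unicodedata

# One user-visible character, written as 'e' + COMBINING ACUTE ACCENT.
decomposed = "e\u0301"
# The same character in precomposed (NFC) form.
composed = unicodedata.normalize("NFC", decomposed)

# Same text to a human, different lengths and different first elements.
lengths = (len(decomposed), len(composed))
first_decomposed = decomposed[0]  # just the bare 'e', accent lost
first_composed = composed[0]

# Indexing the UTF-8 bytes splits the character too, just at a different level.
utf8 = composed.encode("utf-8")
first_byte = utf8[0]
```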
The problem is that there's a severe hole in the ABCs: there is no distinction between "container whose elements are mutable" and "container whose elements and size are mutable".
(Related: there's no distinction for supporting slice operations or not, e.g. `deque`.)
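The `deque` case is easy to demonstrate. This sketch only shows the slicing gap, not the element-vs-size mutability one.

```python
from collections import deque
from collections.abc import MutableSequence

d = deque([1, 2, 3])

# deque is registered as a MutableSequence...
is_mutable_sequence = isinstance(d, MutableSequence)

# ...and item assignment works as the ABC promises...
d[0] = 10

# ...but slicing, which list supports, raises TypeError.
try:
    d[0:2]
    slicing_ok = True
except TypeError:
    slicing_ok = False
```

So code written against `MutableSequence` cannot tell from the ABC alone whether slices are safe to use.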
All of these can be done with raw strings just fine.
For the first `pathlib` bug case, `PATH`-like lookup is common, not just for binaries but also data and conf files. If users explicitly request `./foo` they will be very upset if your program instead looks at `/defaultpath/foo`. Also, God forbid you dare pass a `Path("./--help")` to some program. If you're using `os.path.dirname` this works just fine.
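A minimal sketch of the difference, using `PurePosixPath` and `posixpath` so it behaves the same on any OS:

```python
import posixpath
from pathlib import PurePosixPath

explicit = "./foo"

# pathlib drops the leading "./", so an explicit relative path
# becomes indistinguishable from a bare name.
as_path = str(PurePosixPath(explicit))

# The string-based API still tells the two apart.
dir_explicit = posixpath.dirname(explicit)  # has a directory component
dir_bare = posixpath.dirname("foo")         # has none, so a PATH-like lookup applies

# The "./" that protected this name from being parsed as an option is gone.
as_argument = str(PurePosixPath("./--help"))
```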
For the second `pathlib` bug case, `dir/` is often written so that you'll cause explicit errors if there's a file by that name. Also there are programs like `rsync` where the trailing slash outright changes the meaning of the command. Again, `os.path` APIs give you the correct result.
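Again as a sketch with the POSIX flavours, so the result does not depend on the platform:

```python
import posixpath
from pathlib import PurePosixPath

with_slash = "dir/"

# pathlib silently discards the trailing slash.
as_path = str(PurePosixPath(with_slash))
same_object = PurePosixPath(with_slash) == PurePosixPath("dir")

# The string-based functions keep the information.
head, tail = posixpath.split(with_slash)   # empty tail means "this names a directory"
rejoined = posixpath.join("dir", "")       # joining with "" adds the slash back
```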
For the article mistake, backslash is a perfectly legal character in non-Windows filenames and should not be treated as a directory component separator. Thankfully, `pathlib` doesn't make this mistake at least. OTOH, `/` is reasonable to treat as a directory component separator on Windows (and some native APIs already handle it, though normalization is always a problem).
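The pure path classes make this easy to check without needing both operating systems:

```python
from pathlib import PurePosixPath, PureWindowsPath

# On POSIX, the backslash is just another filename character: one component.
posix_parts = PurePosixPath("some\\path").parts
posix_str = str(PurePosixPath("some\\path"))

# On Windows, "/" is accepted as a separator and normalized to a backslash.
windows_parts = PureWindowsPath("some/path").parts
windows_str = str(PureWindowsPath("some/path"))
```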
I also just found that the `pathlib.Path` constructor ignores extra kwargs. But Python has never bothered much with safety anyway, and this is minor compared to the outright bugs the other issues cause.
The problem with `pathlib` is that it normalizes away critical information, so it can't be used in many situations. `./path` should not be `path` should not be `path/`.
Also the article is wrong about "`Path('some\\path')` becomes `some/path` on Linux/Mac."
I've done something similar. In my case it was a startup script that did something like the following:
- poll github using the search API for PR labels (note that this has sometimes stopped returning correct results, but ...).
  - always do this once at startup
  - you might do this based on notifications; I didn't bother since I didn't need rapid responsiveness. Note that you should not do this for the specific data from a notification though; it's only a way to wake up the script.
  - but no matter what, you should do this after N minutes, since notifications can be lost.
- perform a `git fetch` for your main development branch (the one you perform the real merges to) and all `pull/` refs (git does not do this by default; you'll have to set them up for your local test repo. Note that you want to refer to the unmerged commits for these)
- if the set of commits for all tagged PRs has not changed, wait and poll again
- reset the test repo to the most recent commit from your main development branch
- iterate over all PRs with the appropriate label:
  - ordering notes:
    - if there are commits that have previously tested successfully, you might do them first. But still test again since the merge order could be different. This of course depends on the level of tests you're doing.
    - if you have PRs that depend on other PRs, do them in an appropriate order (perhaps the following will suffice, or maybe you'll have some way of detecting this). As a rule we soft-forbid this though; such PRs should have been merged early.
    - finally, ordering by PR number is probably better than ordering by last commit date
  - attempt the merge (or rebase). If a nop, log that somewhere. If not clean, skip the PR for now (and log that), but only mark this as an error if it was the first PR you've merged (since if there's a conflict it could be a prior PR's fault).
  - run pre-build stuff that might need to create further commits, build the product, and run some quick tests. If they fail, roll back the repo to the previous merge and complain.
  - mark the commit as apparently good. Note that this is specifically applying to commits, not PRs or branch names; I admit I've been sloppy above.
- perform a pre-build, build and quick test again (since we may have rolled back and have a dirty build - in fact, we might not have ended up merging anything!)
- if you have expensive tests, run them only here (and treat this as "unexpected early exit" below). It's presumed that separate parts of your codebase aren't too crazily entangled, so if a particular test fails it should be "obvious" which PR is relevant. Keep in mind that I used this system for assumed viable-work-in-progress PRs.
- kill any existing instance and launch a new instance of the product using the build from the final merged commit and begin accepting real traffic from devs and beta users.
  - users connecting to the instance should see the log
- if the launched instance exits unexpectedly within M minutes AND we actually ended up merging anything into the known-good branch, then reset to the main development branch (and build etc.) so that people at least have a functioning test server, but complain loudly in the MOTD when they connect to it. The condition here means that if it exits suddenly again the whole script goes back up and starts again, which may be necessary if someone intentionally tried to kill the server to force a new merge sequence but it was too soon.
  - alternatively you could try bisecting the set of PR commits or something, but I never bothered. Note that you probably can't use `git bisect` for this since you explicitly do not want to try a commit from the middle of a PR. It might be simpler to whitelist or blacklist one commit at a time, but if you're failing here remember that all tests are unreliable.
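For what it's worth, the "iterate over all PRs" step above could be sketched roughly like this. This is not the original script: `MergeLog`, `merge_labeled_prs`, `try_merge`, `quick_check` and `rollback` are all hypothetical names, and the git and build work is assumed to live behind those callables. I've read "first PR you've merged" as "nothing has been merged yet in this run".

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class MergeLog:
    merged: List[int] = field(default_factory=list)  # PRs that ended up in the test branch
    lines: List[str] = field(default_factory=list)   # what connecting users would get to see


def merge_labeled_prs(
    pr_numbers: Iterable[int],
    try_merge: Callable[[int], str],   # hypothetical: returns "clean", "nop" or "conflict"
    quick_check: Callable[[], bool],   # hypothetical: pre-build, build, quick tests
    rollback: Callable[[], None],      # hypothetical: reset to the previous good merge
) -> MergeLog:
    log = MergeLog()
    for pr in sorted(pr_numbers):  # order by PR number, not by last commit date
        status = try_merge(pr)
        if status == "nop":
            log.lines.append(f"PR {pr}: nothing to merge")
            continue
        if status == "conflict":
            # Only an error when nothing was merged before it;
            # otherwise the conflict could be a prior PR's fault.
            level = "error" if not log.merged else "skipped"
            log.lines.append(f"PR {pr}: {level}: merge was not clean")
            continue
        if not quick_check():
            rollback()
            log.lines.append(f"PR {pr}: error: build or quick tests failed, rolled back")
            continue
        log.merged.append(pr)
        log.lines.append(f"PR {pr}: merged, apparently good")
    return log
```

Passing the git and build steps in as callables is only there so the control flow can be exercised with fakes; a real script would shell out to git and the build system at those points.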
`ReplaceFile` exists to get everyone else's semantics though?