
gh-118761: Optimise import time for shlex #132036

Merged
AA-Turner merged 6 commits into python:main from AA-Turner:opt-shlex
Apr 24, 2025

Conversation

@AA-Turner
Member

@AA-Turner AA-Turner commented Apr 2, 2025

This PR achieves a 33x improvement in import time for the shlex module and a 4-12x improvement in performance for shlex.quote.

Current:

import shlex: cumulative time
mean: 9065.967 µs
median: 9092.000 µs
stdev: 282.015
min: 8587
max: 9556

This PR:

import shlex: cumulative time
mean: 274.300 µs
median: 275.000 µs
stdev: 8.860
min: 259
max: 289
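
Figures like these can be collected by importing the module in fresh interpreters with `-X importtime` and reading the cumulative column. The sketch below works under that assumption and is not necessarily the script used for this PR; `parse_cumulative` and `measure` are hypothetical helper names.

```python
import statistics
import subprocess
import sys


def parse_cumulative(stderr_text, module):
    """Return the cumulative import time (in us) reported for *module*, or None."""
    prefix = "import time:"
    for line in stderr_text.splitlines():
        if not line.startswith(prefix):
            continue
        fields = line[len(prefix):].split("|")
        # Data rows have three fields: self time, cumulative time, module name.
        # The header row also has three fields, but its name never matches.
        if len(fields) == 3 and fields[2].strip() == module:
            return int(fields[1])
    return None


def measure(module="shlex", runs=30):
    """Import *module* in `runs` fresh interpreters and collect cumulative times."""
    samples = []
    for _ in range(runs):
        proc = subprocess.run(
            [sys.executable, "-X", "importtime", "-c", f"import {module}"],
            capture_output=True, text=True, check=True,
        )
        # -X importtime writes its report to stderr.
        value = parse_cumulative(proc.stderr, module)
        if value is not None:
            samples.append(value)
    return samples


if __name__ == "__main__":
    samples = measure()
    if len(samples) >= 2:
        print(f"mean:   {statistics.mean(samples):.3f} us")
        print(f"median: {statistics.median(samples):.3f} us")
        print(f"stdev:  {statistics.stdev(samples):.3f}")
        print(f"min:    {min(samples)}")
        print(f"max:    {max(samples)}")
```

Each sample comes from a separate process, so modules cached in `sys.modules` by an earlier run cannot hide the import cost.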

@AA-Turner AA-Turner added the performance Performance or resource usage label Apr 2, 2025
@JelleZijlstra
Member

I'm not sure this is practically useful, since pretty much anything you'd do within the module will trigger the imports. Also, importing sys and io is practically free since they're imported at startup.

@AA-Turner
Member Author

AA-Turner commented Apr 4, 2025

True, this one is more marginal. quote and join will only import re, whereas split and shlex will import collections/os.path. Where shlex is used in the stdlib, it's often only one of these two groups, so I think it could still have benefits?

Edit: quote and join now no longer import re

Lib/shlex.py Outdated
# to allow as deferred import of re for performance
global _find_unsafe
import re
_find_unsafe = re.compile(r'[^\w@%+=:,./-]', re.ASCII).search
Member

Can this be replaced with an efficient alternative that uses only string methods? We don't use the match result.

Member Author

The fastest I've found is using bytes.translate(). It's expensive to encode a Unicode string to bytes, but bytes.translate() is very fast. With this, we go from ~26 microseconds for quote to ~2 (with only safe characters) and ~6 (with characters needing quotation). This is anywhere from a 4x to 12x improvement.

Are you aware of any faster methods for this check? All of the other ones I tried were slower (str.translate, frozenset, all(map(...))); only bytes.translate performed better than re.search.

safe_quote_translate: Mean +- std dev: 2.06 us +- 0.12 us
safe_quote_re: Mean +- std dev: 25.9 us +- 0.7 us
unsafe_quote_translate: Mean +- std dev: 6.33 us +- 0.46 us
unsafe_quote_re: Mean +- std dev: 26.8 us +- 0.7 us
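
The two checks being compared can be sketched side by side. This is a minimal illustration of the technique described above, not a copy of the code under review; `SAFE_CHARS`, `is_safe_translate` and `is_safe_re` are names chosen here for the example.

```python
import re

# Bytes that never need quoting; the same set as [\w@%+=:,./-] under re.ASCII.
SAFE_CHARS = (b'%+,-./0123456789:=@'
              b'ABCDEFGHIJKLMNOPQRSTUVWXYZ_'
              b'abcdefghijklmnopqrstuvwxyz')

_find_unsafe = re.compile(r'[^\w@%+=:,./-]', re.ASCII).search


def is_safe_translate(s):
    """True if *s* is ASCII and contains only safe characters."""
    # Deleting every safe byte leaves an empty (falsy) bytes object
    # exactly when no unsafe byte was present.
    return s.isascii() and not s.encode().translate(None, delete=SAFE_CHARS)


def is_safe_re(s):
    """Same predicate, expressed with the regular expression search."""
    return _find_unsafe(s) is None
```

Note that `bytes.translate` always processes the whole input, whereas `re.search` can stop at the first unsafe character, so the relative cost depends on the input.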

Member

Are you aware of any faster methods for this check?

No, the fastest I know of is bytes.translate. I know of C functions that would work, but none of them are exposed.

@AA-Turner AA-Turner requested a review from picnixz April 20, 2025 21:50
@AA-Turner AA-Turner requested a review from JelleZijlstra April 21, 2025 00:34
@AA-Turner AA-Turner merged commit 06a26fd into python:main Apr 24, 2025
39 checks passed
@AA-Turner AA-Turner deleted the opt-shlex branch April 24, 2025 15:10
@bonzini

bonzini commented Mar 24, 2026

How was the performance of quote measured? I actually see the new version being slower than the old one on Python 3.14.3 (from Fedora 44) when the quoting is needed:

>>> timeit.timeit("f.old_quote('the quick brown fox jumps over the lazy dog')", globals={'f':f}, number=1000000)
0.37302771798567846
>>> timeit.timeit("f.new_quote('the quick brown fox jumps over the lazy dog')", globals={'f':f}, number=1000000)
0.5016047579993028

and the same speed as before when the quoting is not necessary:

>>> timeit.timeit("f.old_quote('thequickbrownfoxjumpsoverthelazydog')", globals={'f':f}, number=1000000)
0.3631113300070865
>>> timeit.timeit("f.new_quote('thequickbrownfoxjumpsoverthelazydog')", globals={'f':f}, number=1000000)
0.3525112950010225

or even:

>>> timeit.timeit("f.old_quote('a')", globals={'f':f}, number=1000000)
0.11935025999264326
>>> timeit.timeit("f.new_quote('a')", globals={'f':f}, number=1000000)
0.28223515099671204

because regular expression search was able to short-circuit at the first unsafe character.
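
The contents of `f` are not shown in this thread. As a hedged reconstruction, the comparison could be reproduced with something like the following, where `old_quote` is assumed to be the regex-based version from 3.13 and `new_quote` the `bytes.translate`-based version; `compare` is a hypothetical helper.

```python
import re
import timeit

_find_unsafe = re.compile(r'[^\w@%+=:,./-]', re.ASCII).search

_SAFE_CHARS = (b'%+,-./0123456789:=@'
               b'ABCDEFGHIJKLMNOPQRSTUVWXYZ_'
               b'abcdefghijklmnopqrstuvwxyz')


def old_quote(s):
    """Regex-based quoting (assumed stand-in for f.old_quote)."""
    if not s:
        return "''"
    if _find_unsafe(s) is None:
        return s
    # Wrap in single quotes; an embedded ' becomes '"'"'.
    return "'" + s.replace("'", "'\"'\"'") + "'"


def new_quote(s):
    """bytes.translate-based quoting (assumed stand-in for f.new_quote)."""
    if not s:
        return "''"
    if not isinstance(s, str):
        raise TypeError(f"expected string object, got {type(s).__name__!r}")
    if s.isascii() and not s.encode().translate(None, delete=_SAFE_CHARS):
        return s
    return "'" + s.replace("'", "'\"'\"'") + "'"


def compare(inputs, number=200_000):
    """Print the total time of `number` calls for each function and input."""
    for text in inputs:
        for func in (old_quote, new_quote):
            elapsed = timeit.timeit(lambda: func(text), number=number)
            print(f"{func.__name__}({text!r}): {elapsed:.3f} s")


if __name__ == "__main__":
    compare([
        "the quick brown fox jumps over the lazy dog",
        "thequickbrownfoxjumpsoverthelazydog",
        "a",
    ])
```

The lambda adds a constant call overhead to both variants, so only the relative difference between the two functions is meaningful.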

@hugovk
Member

hugovk commented Mar 24, 2026

This is about the import time of shlex, not run time performance.

For example, it takes 4.6 ms on 3.13.11 to import shlex:

python3.13 -X importtime -c "import shlex"
Found existing alias for "python3.13". You should use: "p3"
import time: self [us] | cumulative | imported package
import time:       139 |        139 |   _io
import time:        24 |         24 |   marshal
import time:       205 |        205 |   posix
import time:       453 |        820 | _frozen_importlib_external
import time:       584 |        584 |   time
import time:       194 |        778 | zipimport
import time:        44 |         44 |     _codecs
import time:       465 |        508 |   codecs
import time:       423 |        423 |   encodings.aliases
import time:       597 |       1528 | encodings
import time:       142 |        142 | encodings.utf_8
import time:        57 |         57 | _signal
import time:        22 |         22 |     _abc
import time:       111 |        133 |   abc
import time:       123 |        255 | io
import time:        32 |         32 |       _stat
import time:        64 |         96 |     stat
import time:       605 |        605 |     _collections_abc
import time:        40 |         40 |       errno
import time:        62 |         62 |       genericpath
import time:       102 |        204 |     posixpath
import time:       370 |       1274 |   os
import time:        63 |         63 |   _sitebuiltins
import time:       166 |        166 |   encodings.utf_8_sig
import time:       233 |        233 |   types
import time:       182 |        182 |     importlib
import time:       252 |        252 |     importlib._abc
import time:       157 |        591 |   importlib.util
import time:        40 |         40 |   importlib.machinery
import time:       349 |        349 |   _distutils_hack
import time:       358 |        358 |   sitecustomize
import time:        68 |         68 |   usercustomize
import time:      4273 |       7412 | site
import time:       196 |        196 | linecache
import time:        94 |         94 |           itertools
import time:       121 |        121 |           keyword
import time:        46 |         46 |             _operator
import time:       416 |        461 |           operator
import time:       161 |        161 |           reprlib
import time:        46 |         46 |           _collections
import time:       536 |       1416 |         collections
import time:        34 |         34 |         _functools
import time:       404 |       1853 |       functools
import time:       932 |       2784 |     enum
import time:        51 |         51 |       _sre
import time:       190 |        190 |         re._constants
import time:       317 |        507 |       re._parser
import time:        87 |         87 |       re._casefix
import time:       295 |        938 |     re._compiler
import time:       140 |        140 |     copyreg
import time:       462 |       4323 |   re
import time:       263 |       4585 | shlex

Compared to 0.6 ms on 3.14.2:

python3.14 -X importtime -c "import shlex"
import time: self [us] | cumulative | imported package
import time:       154 |        154 |   _io
import time:        28 |         28 |   marshal
import time:       194 |        194 |   posix
import time:       522 |        897 | _frozen_importlib_external
import time:       672 |        672 |   time
import time:       215 |        887 | zipimport
import time:        41 |         41 |     _codecs
import time:       420 |        461 |   codecs
import time:       373 |        373 |   encodings.aliases
import time:       577 |       1410 | encodings
import time:       149 |        149 | encodings.utf_8
import time:        55 |         55 | _signal
import time:        24 |         24 |       _abc
import time:       117 |        141 |     abc
import time:        32 |         32 |       _stat
import time:        68 |        100 |     stat
import time:       632 |        632 |     _collections_abc
import time:        39 |         39 |       errno
import time:        68 |         68 |       genericpath
import time:       101 |        207 |     posixpath
import time:       391 |       1468 |   os
import time:        64 |         64 |   _sitebuiltins
import time:       163 |        163 |   encodings.utf_8_sig
import time:       937 |        937 |   _distutils_hack
import time:        28 |         28 |     _types
import time:       192 |        219 |   types
import time:       112 |        112 |     importlib
import time:       158 |        158 |     importlib._abc
import time:       116 |        384 |   importlib.util
import time:        41 |         41 |   importlib.machinery
import time:       635 |        635 |   sitecustomize
import time:        93 |         93 |   usercustomize
import time:      2443 |       6444 | site
import time:       222 |        222 | linecache
import time:       167 |        167 |   io
import time:       429 |        595 | shlex

@picnixz
Member

picnixz commented Mar 24, 2026

There was also a minor improvement for quote which seems to have been affected by the bytes.translate change. Maybe we should revert the re import?

@bonzini

bonzini commented Mar 24, 2026

The description says "a 4-12x improvement in performance for shlex.quote". That's what I was asking information about.

@picnixz
Member

picnixz commented Mar 24, 2026

Could you open an issue with the relevant benchmarks and what f contains please?

bonzini added a commit to bonzini/cpython that referenced this pull request Mar 25, 2026
Commit 06a26fd ("pythongh-118761: Optimise import time for ``shlex`` (python#132036)")
made shlex.quote() slower when the input has to be quoted. This is because the regular expression
search was able to short-circuit at the first unsafe character.

Go back to the same algorithm as 3.13, but make the "import re" and compilation
of the regular expression lazy.

Testing s.isascii() makes shlex.quote() twice as fast in the non-ASCII
case, but costs up to 25% of the full run time (because it necessitates
an earlier isinstance check) if the string *is* ASCII.  The latter is
probably the common case, so drop the check.
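
A lazy "import re" with lazy compilation, as described in this commit message, might look roughly like the sketch below. It assumes the 3.13 regular expression and is not the text of the referenced commit; `quote_lazy` is a name used only for this example.

```python
_find_unsafe = None  # compiled on first use, so importing this module never imports re


def quote_lazy(s):
    """Return a shell-escaped version of *s*, compiling the regex on demand."""
    global _find_unsafe
    if not s:
        return "''"
    if _find_unsafe is None:
        # Deferred import: the cost of importing re is paid by the first
        # call that needs it rather than at module import time.
        import re
        _find_unsafe = re.compile(r'[^\w@%+=:,./-]', re.ASCII).search
    if _find_unsafe(s) is None:
        return s
    # Use single quotes, and put embedded single quotes into double quotes.
    return "'" + s.replace("'", "'\"'\"'") + "'"
```

After the first call, the only extra work per call is a single `is None` test on a module global.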

Labels

performance Performance or resource usage

6 participants