Path : /lib/python2.7/site-packages/kitchen/text/
Current File : //lib/python2.7/site-packages/kitchen/text/misc.pyc
є
iн:Oc@s5dZddlZddlZddlZyddlZWnek
rSdZnXddlZddl	m
Z
ddlmZe
j
ГdZeeddГdd	ged
dГГZeejeeГГZejdГZed
ДZdddДZddДZdДZddДZddДZdZdS(s═
---------------------------------------------
Miscellaneous functions for manipulating text
---------------------------------------------

Collection of text functions that don't fit in another category.
i    N(tsets(tControlCharErrorg333333у?iiiiii s(?s)<[^>]*>|&#?\w+;cCs▒t|tГs'ttjdГГВnd}yt||dГWntk
rZd}nX|rЮtrЮ|rЮtj	|Г}|dt
krЮ|d}qЮn|sнd}n|S(s#Try to guess the encoding of a byte :class:`str`

    :arg byte_string: byte :class:`str` to guess the encoding of
    :kwarg disable_chardet: If this is True, we never attempt to use
        :mod:`chardet` to guess the encoding.  This is useful if you need to
        have reproducibility whether :mod:`chardet` is installed or not.
        Default: :data:`False`.
    :raises TypeError: if :attr:`byte_string` is not a byte :class:`str` type
    :returns: string containing a guess at the encoding of
        :attr:`byte_string`.  This is appropriate to pass as the encoding
        argument when encoding and decoding unicode strings.

    We start by attempting to decode the byte :class:`str` as :term:`UTF-8`.
    If this succeeds we tell the world it's :term:`UTF-8` text.  If it doesn't
    and :mod:`chardet` is installed on the system and :attr:`disable_chardet`
    is False this function will use it to try detecting the encoding of
    :attr:`byte_string`.  If :mod:`chardet` is not installed, or it cannot
    determine the encoding with high enough confidence, we rather
    arbitrarily claim that it is ``latin-1``.  Because every byte value maps
    to a character in ``latin-1``, decoding from ``latin-1`` to
    :class:`unicode` will not raise :exc:`UnicodeError`, although the output
    might be mangled.
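
The fallback order described above (strict UTF-8 first, then an optional detector, then ``latin-1``) can be sketched as follows. This is a minimal Python 3 illustration, not the module's own code: ``guess_encoding_sketch``, the ``detector`` parameter (a stand-in for a callable such as ``chardet.detect``) and the 0.6 cut-off are all assumptions made for the example.

```python
def guess_encoding_sketch(byte_string, detector=None, threshold=0.6):
    """Return a best-guess encoding name for a bytes object.

    `detector` is a hypothetical callable that returns a dict with
    'encoding' and 'confidence' keys; `threshold` is an assumed cut-off.
    """
    if not isinstance(byte_string, bytes):
        raise TypeError("byte_string must be a byte string")

    # Step 1: if the bytes decode strictly as UTF-8, report UTF-8.
    try:
        byte_string.decode("utf-8", "strict")
        return "utf-8"
    except UnicodeDecodeError:
        pass

    # Step 2: ask the optional detector, but only trust confident answers.
    if detector is not None:
        info = detector(byte_string)
        if info.get("encoding") and info.get("confidence", 0) >= threshold:
            return info["encoding"]

    # Step 3: latin-1 maps every byte to a character, so decoding never fails.
    return "latin-1"
```

Passing ``detector=None`` plays the role of ``disable_chardet=True``: the result then depends only on the input bytes.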
    s'byte_string must be a byte string (str)sutf-8tstrictt
confidencetencodingslatin-1N(t
isinstancetstrt	TypeErrortktb_tunicodetUnicodeDecodeErrortNonetchardettdetectt_CHARDET_THRESHHOLD(tbyte_stringtdisable_chardettinput_encodingtdetection_info((s5/usr/lib/python2.7/site-packages/kitchen/text/misc.pytguess_encoding;s
	sutf-8treplacecCszy||ko||kSWntk
r/nXt|tГrT|j||Г}n|j||Г}n||krvtStS(s▌Compare two strings, converting to byte :class:`str` if one is
    :class:`unicode`

    :arg str1: First string to compare
    :arg str2: Second string to compare
    :kwarg encoding: If we need to convert one string into a byte :class:`str`
        to compare, the encoding to use.  Default is :term:`utf-8`.
    :kwarg errors: What to do if we encounter errors when encoding the string.
        See the :func:`kitchen.text.converters.to_bytes` documentation for
        possible values.  The default is ``replace``.

    This function prevents :exc:`UnicodeError` (python-2.4 or less) and
    :exc:`UnicodeWarning` (python 2.5 and higher) when we compare
    a :class:`unicode` string to a byte :class:`str`.  The errors normally
    arise because the conversion is done to :term:`ASCII`.  This function
    lets you convert to :term:`utf-8` or another encoding instead.

    .. note::

        When we need to convert one of the strings from :class:`unicode` in
        order to compare them we convert the :class:`unicode` string into
        a byte :class:`str`.  That means that strings can compare differently
        if you use different encodings for each.

    Note that ``str1 == str2`` is faster than this function if you can accept
    the following limitations:

    * Limited to python-2.5+ (otherwise a :exc:`UnicodeDecodeError` may be
      thrown)
    * Will generate a :exc:`UnicodeWarning` if non-:term:`ASCII` byte
      :class:`str` is compared to :class:`unicode` string.
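
As a rough illustration of the comparison rule above, here is a Python 3 sketch in which ``str`` stands in for :class:`unicode` and ``bytes`` for the byte :class:`str`. The name ``str_eq_sketch`` is hypothetical. Only the text-side operand is encoded, so both sides end up as bytes before they are compared.

```python
def str_eq_sketch(str1, str2, encoding="utf-8", errors="replace"):
    """Compare two strings, encoding the text one if the types differ."""
    if isinstance(str1, str) and isinstance(str2, bytes):
        # Encode the text operand so both sides are bytes.
        str1 = str1.encode(encoding, errors)
    elif isinstance(str1, bytes) and isinstance(str2, str):
        str2 = str2.encode(encoding, errors)
    return str1 == str2
```

The result depends on the encoding chosen: the text ``"café"`` equals its ``latin-1`` bytes only when ``encoding="latin-1"`` is passed.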
    (tUnicodeErrorRR
tencodetTruetFalse(tstr1tstr2Rterrors((s5/usr/lib/python2.7/site-packages/kitchen/text/misc.pytstr_eqds!
cCst|tГs'ttjdГГВn|dkrXtttdgt	tГГГ}nд|dkrЙtttdgt	tГГГ}ns|dkrчd}t
|Г}gtD]}||krо|^qоr№ttjdГГВq№nt
tjdГГВ|r|j|Г}n|S(	s Look for and transform :term:`control characters` in a string

    :arg string: string to search for and transform :term:`control characters`
        within
    :kwarg strategy: XML does not allow :term:`ASCII` :term:`control
        characters`.  When we encounter those we need to know what to do.
        Valid options are:

        :replace: (default) Replace the :term:`control characters`
            with ``"?"``
        :ignore: Remove the characters altogether from the output
        :strict: Raise a :exc:`~kitchen.text.exceptions.ControlCharError` when
            we encounter a control character
    :raises TypeError: if :attr:`string` is not a unicode string.
    :raises ValueError: if the strategy is not one of replace, ignore, or
        strict.
    :raises kitchen.text.exceptions.ControlCharError: if the strategy is
        ``strict`` and a :term:`control character` is present in the
        :attr:`string`
    :returns: :class:`unicode` string with no :term:`control characters` in
        it.
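
The three strategies can be sketched with a translation table, as below. This is a Python 3 illustration under stated assumptions: ``CONTROL_CODES`` (the C0 range minus tab, line feed and carriage return) is an assumed set, and ``ControlCharError`` is a local stand-in for the exception in ``kitchen.text.exceptions``.

```python
class ControlCharError(Exception):
    """Local stand-in for kitchen.text.exceptions.ControlCharError."""


# Assumed set of disallowed code points: C0 controls except \t, \n and \r.
CONTROL_CODES = frozenset(c for c in range(32) if c not in (9, 10, 13))


def process_control_chars_sketch(string, strategy="replace"):
    """Replace, drop, or reject control characters in a text string."""
    if not isinstance(string, str):
        raise TypeError("string must be a text (unicode) string")

    if strategy == "ignore":
        # Mapping a code point to None deletes it during translate().
        table = dict.fromkeys(CONTROL_CODES, None)
    elif strategy == "replace":
        table = dict.fromkeys(CONTROL_CODES, "?")
    elif strategy == "strict":
        if any(ord(ch) in CONTROL_CODES for ch in string):
            raise ControlCharError("ASCII control code present in string input")
        return string
    else:
        raise ValueError("strategy must be one of ignore, replace, or strict")

    return string.translate(table)
```

A translation table keeps the work to a single pass over the string for both ``replace`` and ``ignore``.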
    sDprocess_control_char must have a unicode type as the first argument.tignoreRu?Rs*ASCII control code present in string inputsXThe strategy argument to process_control_chars must be one of ignore, replace, or strictN(RR
RRR	tdicttzipt_CONTROL_CODESRtlent	frozensett_CONTROL_CHARSRt
ValueErrort	translate(tstringtstrategyt
control_tabletdatatc((s5/usr/lib/python2.7/site-packages/kitchen/text/misc.pytprocess_control_charsУs%%%cCsCdД}t|tГs0ttjdГГВntjt||ГS(s/Substitute unicode characters for HTML entities

    :arg string: :class:`unicode` string to substitute out html entities
    :raises TypeError: if something other than a :class:`unicode` string is
        given
    :rtype: :class:`unicode` string
    :returns: The plain text without html entities
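
A regex-driven substitution of this kind can be sketched in Python 3 as follows. The names ``ENTITY_RE`` and ``html_entities_unescape_sketch`` are hypothetical, and the sketch looks names up in the standard library's ``html.entities.name2codepoint`` table, which is an assumption about how the lookup is done. Tags are dropped, numeric and named entities are converted, and unknown entities are left untouched.

```python
import re
from html.entities import name2codepoint

# Matches either a tag or a named/numeric entity reference.
ENTITY_RE = re.compile(r"(?s)<[^>]*>|&#?\w+;")


def html_entities_unescape_sketch(string):
    """Strip tags and turn HTML entities into the characters they name."""
    if not isinstance(string, str):
        raise TypeError("string must be a text (unicode) string")

    def fixup(match):
        text = match.group(0)
        if text.startswith("<"):
            return ""  # drop markup tags entirely
        if text.startswith("&#"):
            try:
                if text.startswith("&#x"):
                    return chr(int(text[3:-1], 16))  # hexadecimal reference
                return chr(int(text[2:-1]))          # decimal reference
            except (ValueError, OverflowError):
                return text  # malformed number: leave as-is
        codepoint = name2codepoint.get(text[1:-1])
        return chr(codepoint) if codepoint is not None else text

    return ENTITY_RE.sub(fixup, string)
```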
    cSs |jdГ}|d dkr#dS|d dkrПyE|d dkr`tt|dd	!d
ГГStt|dd	!ГГSWqtk
rЛqXnН|d dkrtjj|dd	!jdГГ}|r|d d
kr	ytt|dd	!ГГSWqtk
rqXqt|dГSqn|S(Niiu<tiu&#iu&#xi    iu&sutf-8s&#s
iso-8859-1(	tgrouptunichrtintR%thtmlentitydefst
entitydefstgetRR
(tmatchR'tentity((s5/usr/lib/python2.7/site-packages/kitchen/text/misc.pytfixup╘s(
"
sFhtml_entities_unescape must have a unicode type for its first argument(RR
RRR	tretsubt
_ENTITY_RE(R'R6((s5/usr/lib/python2.7/site-packages/kitchen/text/misc.pythtml_entities_unescape╦s		cCs^t|tГstSyt||Г}Wntk
r:tSXt|Г}|jtГrZtStS(s├Check that a byte :class:`str` would be valid in xml

    :arg byte_string: Byte :class:`str` to check
    :arg encoding: Encoding of the xml file.  Default: :term:`UTF-8`
    :returns: :data:`True` if the string is valid.  :data:`False` if it would
        be invalid in the xml file

    In some cases you'll have a whole bunch of byte strings and rather than
    transforming them to :class:`unicode` and back to byte :class:`str` for
    output to xml, you will just want to make sure they work with the xml file
    you're constructing.  This function will help you do that.  Example::

        ARRAY_OF_MOSTLY_UTF8_STRINGS = [...]
        processed_array = []
        for string in ARRAY_OF_MOSTLY_UTF8_STRINGS:
            if byte_string_valid_xml(string, 'utf-8'):
                processed_array.append(string)
            else:
                processed_array.append(guess_bytes_to_xml(string, encoding='utf-8'))
        output_xml(processed_array)
    (	RRRR
RR#tintersectionR$R(RRtu_stringR*((s5/usr/lib/python2.7/site-packages/kitchen/text/misc.pytbyte_string_valid_xmlїs
cCs*yt||ГWntk
r%tSXtS(s╘Detect if a byte :class:`str` is valid in a specific encoding

    :arg byte_string: Byte :class:`str` to test for bytes not valid in this
        encoding
    :kwarg encoding: encoding to test against.  Defaults to :term:`UTF-8`.
    :returns: :data:`True` if there are no characters that are invalid in
        the given encoding.  :data:`False` if an invalid character is detected.

    .. note::

        This function checks whether the byte :class:`str` is valid in the
        specified encoding.  It **does not** detect whether the byte
        :class:`str` actually was encoded in that encoding.  If you want that
        sort of functionality, you probably want to use
        :func:`~kitchen.text.misc.guess_encoding` instead.
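
The distinction drawn in the note can be made concrete with a small Python 3 sketch, where ``byte_string_valid_encoding_sketch`` is a hypothetical name: the check only attempts a strict decode, so the same bytes can be "valid" in several encodings at once.

```python
def byte_string_valid_encoding_sketch(byte_string, encoding="utf-8"):
    """Return True if byte_string decodes cleanly in the given encoding."""
    try:
        byte_string.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


sample = b"caf\xe9"
# Invalid as UTF-8, yet valid in both latin-1 and cp1252, so validity
# alone cannot tell us which encoding actually produced the bytes.
results = {
    name: byte_string_valid_encoding_sketch(sample, name)
    for name in ("utf-8", "latin-1", "cp1252")
}
```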
    (R
RRR(RR((s5/usr/lib/python2.7/site-packages/kitchen/text/misc.pytbyte_string_valid_encodings

R>R=RR:R,R(sbyte_string_valid_encodingsbyte_string_valid_xmlsguess_encodingshtml_entities_unescapesprocess_control_charssstr_eq(t__doc__R1t	itertoolsR7R
tImportErrorRtkitchenRtkitchen.pycompat24Rtkitchen.text.exceptionsRtadd_builtin_setRR#trangeR!timapR/R$tcompileR9RRRR,R:R=R>t__all__(((s5/usr/lib/python2.7/site-packages/kitchen/text/misc.pyt<module>s0

,)/8	*(