|
Text.Regex.Posix.Wrap | Portability | non-portable (regex-base needs MPTC+FD) | Stability | experimental | Maintainer | libraries@haskell.org, textregexlazy@personal.mightyreason.com |
|
|
|
|
|
Description |
WrapPosix.hsc exports a wrapped version of the ffi imports. To
increase type safety, the flags are newtype'd. The other important
export is a Regex type that is specific to the Posix library
backend. The flags are documented in Text.Regex.Posix. The
defaultCompOpt is (compExtended .|. compNewline).
The Regex, CompOption, and ExecOption types and their RegexOptions
instance is declared. The =~ and =~~ convenience functions are
defined.
The exported symbols are the same whether 1 is defined, but
when it is not defined then getVersion == Nothing and all other
exported values will call error or fail.
This module will fail or error only if allocation fails or a nullPtr
is passed in.
2009-January : wrapMatchAll and wrapCount now adjust the execution
option execNotBOL after the first result to take into account '\n'
in the text immediately before the next matches. (version 0.93.3)
2009-January : wrapMatchAll and wrapCount have been changed to
return all non-overlapping matches, including empty matches even if
they coincide with the end of the previous non-empty match. The
change is that the first non-empty match no longer terminates the
search. One can filter the results to obtain the old behavior or
to obtain the behavior of sed, where sed eliminates the empty
matches which coincide with the end of non-empty matches. (version
0.94.0)
|
|
Synopsis |
|
|
|
|
High-level API
|
|
|
A compiled regular expression.
|
|
|
|
RegOffset is typedef int regoff_t on Linux and ultimately typedef
long long __int64_t on Max OS X. So rather than saying
2,147,483,647 is all the length you need, I'll take the larger:
9,223,372,036,854,775,807 should be enough bytes for anyone, no
need for Integer. The alternative is to compile to different sizes
in a platform dependent manner with type RegOffset = (, which I do not want to do.
There is also a special value unusedRegOffset :: RegOffset which is
(-1) and as a starting index means that the subgroup capture was
unused. Otherwise the RegOffset indicates a character boundary that
is before the character at that index offset, with the first
character at index offset 0. So starting at 1 and ending at 2 means
to take only the second character.
|
|
|
|
|
|
Low-level API
|
|
|
The return code will be retOk when it is the Haskell wrapper and
not the underlying library generating the error message.
|
|
|
|
|
|
|
|
wrapMatch returns offsets for the begin and end of each capture.
Unused captures have offsets of unusedRegOffset which is (-1)
|
|
|
wrapMatchAll returns the offset and length of each capture.
Unused captures have an offset of unusedRegOffset which is (-1) and
length of 0.
|
|
|
|
Miscellaneous
|
|
|
|
Compilation options
|
|
|
A bitmapped CInt containing options for compilation of regular
expressions. Option values (and their man 3 regcomp names) are
- compBlank which is a completely zero value for all the flags.
This is also the blankCompOpt value.
- compExtended (REG_EXTENDED) which can be set to use extended instead
of basic regular expressions.
This is set in the defaultCompOpt value.
- compNewline (REG_NEWLINE) turns on newline sensitivity: The dot (.)
and inverted set [^ ] never match newline, and ^ and $ anchors do
match after and before newlines.
This is set in the defaultCompOpt value.
- compIgnoreCase (REG_ICASE) which can be set to match ignoring upper
and lower distinctions.
- compNoSub (REG_NOSUB) which turns off all information from matching
except whether a match exists.
| Constructors | |
|
|
|
A completely zero value for all the flags.
This is also the blankCompOpt value.
|
|
|
|
|
|
|
|
|
|
Execution options
|
|
|
A bitmapped CInt containing options for execution of compiled
regular expressions. Option values (and their man 3 regexec names) are
- execBlank which is a complete zero value for all the flags. This is
the blankExecOpt value.
- execNotBOL (REG_NOTBOL) can be set to prevent ^ from matching at the
start of the input.
- execNotEOL (REG_NOTEOL) can be set to prevent $ from matching at the
end of the input (before the terminating NUL).
| Constructors | |
|
|
|
A completely zero value for all the flags.
This is also the blankExecOpt value.
|
|
|
|
|
|
Return codes
|
|
|
ReturnCode is an enumerated CInt, corresponding to the error codes
from man 3 regex:
- retBadbr (REG_BADBR) invalid repetition count(s) in { }
- retBadpat (REG_BADPAT) invalid regular expression
- retBadrpt (REG_BADRPT) ?, *, or + operand invalid
- retEcollate (REG_ECOLLATE) invalid collating element
- retEctype (REG_ECTYPE) invalid character class
- retEescape (REG_EESCAPE) \ applied to unescapable character
- retEsubreg (REG_ESUBREG) invalid backreference number
- retEbrack (REG_EBRACK) brackets [ ] not balanced
- retEparen (REG_EPAREN) parentheses ( ) not balanced
- retEbrace (REG_EBRACE) braces { } not balanced
- retErange (REG_ERANGE) invalid character range in [ ]
- retEspace (REG_ESPACE) ran out of memory
- retNoMatch (REG_NOMATCH) The regexec() function failed to match
| Constructors | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Produced by Haddock version 2.6.0 |