Parse & Extract : unable to use a custom regex #193

Open
simonaubertbd opened this issue Nov 15, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@simonaubertbd

Hello,

I tried to use a custom regex by adding it :
image

Result :
image

Error

pattern contains no capture groups


ValueError Traceback (most recent call last)
Cell In[9], line 20
16 inlineInput1 = pd.read_csv(StringIO(inlineInput1_data)).convert_dtypes()
19 # Extract data using regex
---> 20 extract2_extracted = inlineInput1['LastName'].str.extract(r"^[a-zA-Z]+$")
21 extract2_extracted.columns = []
22 extract2 = inlineInput1.join(extract2_extracted, rsuffix="_extracted")

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\core\strings\accessor.py:137, in forbid_nonstring_types.<locals>._forbid_nonstring_types.<locals>.wrapper(self, *args, **kwargs)
132 msg = (
133 f"Cannot use .str.{func_name} with values of "
134 f"inferred dtype '{self._inferred_dtype}'."
135 )
136 raise TypeError(msg)
--> 137 return func(self, *args, **kwargs)

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\core\strings\accessor.py:2740, in StringMethods.extract(self, pat, flags, expand)
2738 regex = re.compile(pat, flags=flags)
2739 if regex.groups == 0:
-> 2740 raise ValueError("pattern contains no capture groups")
2742 if not expand and regex.groups > 1 and isinstance(self._data, ABCIndex):
2743 raise ValueError("only one regex group is supported with Index")

ValueError: pattern contains no capture groups

Not sure I'm using it right (maybe a documentation issue?)

Best regards,

Simon

@tgourdel
Contributor

Hi Simon, yes indeed: in Python (here, pandas' str.extract), the regex needs a "capture group", i.e. a pair of parentheses that tells the program which part of the match to return. In your case:
^([a-zA-Z]+)$

definitely a documentation issue!
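
For anyone landing here with the same error, the difference can be reproduced outside the tool with a minimal sketch. The sample names below are hypothetical stand-ins for the LastName column, and the variable names are only illustrative; the two patterns are the ones from this thread.

```python
import re

import pandas as pd

# Hypothetical sample data standing in for the 'LastName' column from the issue.
last_names = pd.Series(["Aubert", "Gourdel", "O'Neil", "Smith2"], dtype="string")

without_group = r"^[a-zA-Z]+$"    # pattern from the original report
with_group = r"^([a-zA-Z]+)$"     # same pattern wrapped in one capture group

# str.extract refuses any pattern whose compiled regex has zero groups.
groups_without = re.compile(without_group).groups
groups_with = re.compile(with_group).groups

error_message = None
try:
    last_names.str.extract(without_group)
except ValueError as err:
    error_message = str(err)

# With one capture group, the result is a DataFrame with a single column (0).
# Rows that do not match the pattern come back as missing values.
extracted = last_names.str.extract(with_group)
print(extracted)
```

Each pair of parentheses in the pattern becomes one output column, so a pattern with two groups would yield two columns.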

@simonaubertbd
Author

@tgourdel And I'm definitely not a Python dev ! ;)

@tgourdel tgourdel added the enhancement New feature or request label Nov 19, 2024