Hi Matt,
I'm trying to scrape Google. I followed your blog post on scraping Google in 3 lines and got the following error:
AttributeError Traceback (most recent call last)
Cell In[2], line 1
----> 1 results = seo.get_serps("stupid")
2 print(results)
File c:\Users\stephan.rudolph\Coding\testenv\Lib\site-packages\ecommercetools\seo\google_search.py:144, in get_serps(query, output)
133 """Return the first 10 Google search results for a given query.
134
135 Args:
(...)
140 results (dict): Results of query.
141 """
143 response = _get_results(query)
--> 144 results = _parse_search_results(response)
146 if results:
147 if output == "dataframe":
File c:\Users\stephan.rudolph\Coding\testenv\Lib\site-packages\ecommercetools\seo\google_search.py:124, in _parse_search_results(response)
118 output = []
120 for result in results:
121 item = {
122 'title': result.find(css_identifier_title, first=True).text,
123 'link': result.find(css_identifier_link, first=True).attrs['href'],
--> 124 'text': result.find(css_identifier_text, first=True).text
...
125 }
127 output.append(item)
129 return output
AttributeError: 'NoneType' object has no attribute 'text'
Then I tried your other blog post on scraping with Python, which does not rely on the ecommercetools package, and followed it to the T.
Here is the interesting part:
results = google_search("stupid")
results
yields normal output. Rerunning the same Jupyter cell with the keyword
results = google_search("allergy")
results
yields
AttributeError Traceback (most recent call last)
Cell In[9], line 1
----> 1 results = google_search("allergy")
2 results
Cell In[8], line 3, in google_search(query)
1 def google_search(query):
2 response = get_results(query)
----> 3 return parse_results(response)
Cell In[7], line 17, in parse_results(response)
10 output = []
12 for result in results:
14 item = {
15 'title': result.find(css_identifier_title, first=True).text,
16 'link': result.find(css_identifier_link, first=True).attrs['href'],
---> 17 'text': result.find(css_identifier_text, first=True).text
18 }
20 output.append(item)
22 return output
AttributeError: 'NoneType' object has no attribute 'text'
So sometimes result.find(css_identifier_text, first=True) does not return an element but None?
I have no idea under which circumstances this None arises, but the behavior is as follows:
seo.get_serps() from ecommercetools consistently throws the error, while the hand-written equivalent is keyword sensitive, e.g. "allergy" throws the error, "keyword sensitive" does not.
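For reference, a minimal sketch of a None-safe version of the per-result parsing step. It assumes that `result` behaves like a requests_html element, whose `find(selector, first=True)` returns None when the selector matches nothing. The helper names `text_or_none`, `attr_or_none`, and `parse_result` are hypothetical and not part of ecommercetools or the blog post code.

```python
def text_or_none(element):
    """Return element.text, or None when the selector matched nothing."""
    return element.text if element is not None else None


def attr_or_none(element, name):
    """Return the named attribute, or None when the element or attribute is missing."""
    return element.attrs.get(name) if element is not None else None


def parse_result(result, css_title, css_link, css_text):
    """Build one result item without assuming that every selector matches.

    `result` is assumed to expose find(selector, first=True) and to return
    None for a selector that has no match.
    """
    return {
        'title': text_or_none(result.find(css_title, first=True)),
        'link': attr_or_none(result.find(css_link, first=True), 'href'),
        'text': text_or_none(result.find(css_text, first=True)),
    }
```

With a guard like this, a result block that lacks a text snippet would produce an item with 'text' set to None instead of raising the AttributeError, which should also make it easier to see which results are missing the snippet.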
stRudolph changed the title from "response from _get_results(query) yields NoneType which leads to parsing Fail" to "response from _get_results(query) contains NoneType which leads to parsing Fail" on Feb 9, 2023