I am using a Jupyter notebook to run the script. I used the example from this site, but with an actual company website. This is on Windows 10 using the latest version of Anaconda.
What am I doing incorrectly?
Input:
from seoanalyzer import analyze
site = 'http://www.site.com'
sitemap = None
output = analyze(site, sitemap)
print(output)
Results:
UnicodeDecodeError Traceback (most recent call last)
in
4 sitemap = None
5
----> 6 output = analyze(site, sitemap)
7 print(output)
C:\ProgramData\Anaconda3\lib\site-packages\seoanalyzer\analyzer.py in analyze(url, sitemap_url)
15 site = Website(url, sitemap_url)
16
---> 17 site.crawl()
18
19 for p in site.crawled_pages:
C:\ProgramData\Anaconda3\lib\site-packages\seoanalyzer\website.py in crawl(self)
63 continue
64
---> 65 page.analyze()
66
67 self.content_hashes[page.content_hash].add(page.url)
C:\ProgramData\Anaconda3\lib\site-packages\seoanalyzer\page.py in analyze(self, raw_html)
170 return
171 else:
--> 172 raw_html = page.data.decode('utf-8')
173
174 self.content_hash = hashlib.sha1(raw_html.encode('utf-8')).hexdigest()
UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 31608-31609: invalid continuation byte
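The error means the page being crawled is not valid UTF-8 (the site is likely serving Latin-1/cp1252 content), while `page.py` calls `page.data.decode('utf-8')`, which is a strict decode and raises on the first invalid byte. A minimal sketch of the failure mode, independent of seoanalyzer (the `"café"` sample string is just an illustration, not data from the actual site):

```python
# Bytes that are valid Latin-1 but not valid UTF-8 reproduce the error.
raw = "café".encode("latin-1")  # b'caf\xe9' — 0xe9 is an invalid UTF-8 continuation byte here

try:
    raw.decode("utf-8")  # strict decode, as in seoanalyzer's page.py
except UnicodeDecodeError as e:
    print("strict decode failed:", e.reason)

# A tolerant decode substitutes undecodable bytes instead of raising:
text = raw.decode("utf-8", errors="replace")
print(text)  # 'caf\ufffd'
```

If patching the library locally is acceptable, changing that line to `page.data.decode('utf-8', errors='replace')` (or decoding with the encoding the server actually declares) would let the crawl continue past non-UTF-8 pages, at the cost of replacement characters in the analyzed text.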