Add SitemapReader originally developed in OERSI #469

Draft
wants to merge 1 commit into
base: master
fsteeg (Member) commented Sep 22, 2022

Reads a sitemap from a URL and sends each `loc` URL to the receiver.

E.g. use `"https://hoou.de/sitemap.xml" | read-sitemap | open-http ...` in a Flux workflow to process every document linked in the sitemap.

Supports paging via a `from=` query string parameter in the sitemap URL.
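The two behaviours described above can be illustrated with a rough sketch. This is not the code from this PR (which is a Java module); it is a minimal Python illustration. The function names `extract_locs` and `next_page_url` are hypothetical, and the paging rule (advance `from=` by the number of entries on the current page) is an assumption, since the description does not spell out how the offset is incremented.

```python
import re
import xml.etree.ElementTree as ET
from typing import List, Optional

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_locs(sitemap_xml: str) -> List[str]:
    """Return the text of every <loc> element in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [
        el.text.strip()
        for el in root.iter(SITEMAP_NS + "loc")
        if el.text and el.text.strip()
    ]


def next_page_url(url: str, page_size: int) -> Optional[str]:
    """Build the URL of the next page, or None if there is nothing to page.

    Assumption: `from=` is a numeric offset that grows by the number of
    entries found on the current page.
    """
    match = re.search(r"[?&]from=(\d+)", url)
    if match is None or page_size == 0:
        return None
    next_from = int(match.group(1)) + page_size
    # Replace only the digits of the matched `from=` value.
    return url[:match.start(1)] + str(next_from) + url[match.end(1):]
```

A caller would fetch the sitemap, pass each value from `extract_locs` downstream, and then request `next_page_url(url, len(locs))` until it returns `None`.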

Assigning @dr0i for code review due to the (albeit loose) paging relation to #464.

We don't have a dedicated issue for this, maybe @TobiasNx could do functional review here?

See:

https://en.wikipedia.org/wiki/Sitemaps
https://gitlab.com/oersi/oersi-etl/-/issues/4
https://gitlab.com/oersi/oersi-etl/-/issues/17
@dr0i dr0i removed their assignment Sep 23, 2022
fsteeg (Member, Author) commented Sep 23, 2022

Discussed in our planning meeting: we're putting this on hold to investigate if we actually need this kind of specific module for reading sitemaps, or if we can build something based on existing modules and the upcoming paging support (#464).

sonarcloud bot commented Mar 3, 2023

Kudos, SonarCloud Quality Gate passed!

0 Bugs (rating A)
0 Vulnerabilities (rating A)
0 Security Hotspots (rating A)
1 Code Smell (rating A)

No coverage information
0.0% Duplication
