
adding description info to the fileDscr section in DDI Codebook. #5051 #10938

Open. Wants to merge 1 commit into base: develop

@landreev (Contributor) commented Oct 18, 2024

What this PR does / why we need it:

Apparently, users have been asking for this since 2018: for tabular (ingested) files that have the Description field populated, the description was never exported in the DDI. (Non-ingested files always had their descriptions exported, in the corresponding <otherMat> sections.)
There is no obvious field for it under <fileDscr> in the DDI Codebook schema, which is probably the reason we chose not to export it back in the day (?), but putting it into another dedicated free-text <notes> field seems like a reasonable solution.

The RestAssured export tests are passing, so un-drafting the PR.

I kept the changes minimal to stay under the "3" estimate.

Which issue(s) this PR closes:

Special notes for your reviewer:

Suggestions on how to test this:

Straightforward. Upload a file that's known to be ingestible (Stata, CSV ... doesn't matter). Populate the description field in the file metadata. Publish the dataset. Look at the DDI export; the description should now be showing in the corresponding <fileDscr ...>, like this:

<notes level="file" type="DATAVERSE:FILEDESC" subject="DataFile Description">
   This is a tabular file produced from a Stata .dta file with rich descriptive metadata
</notes>
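
For anyone who would rather script this check than eyeball the XML, here is a minimal sketch in Python. It is not part of this PR. It assumes the export is available from the native API's `/api/datasets/export?exporter=ddi&persistentId=...` endpoint (please confirm against the API guide for your version), that the note sits somewhere under `<fileDscr>`, and it matches elements by local name so it does not depend on the exact namespace URI. The helper names (`export_url`, `file_descriptions`), the sample namespace and the `ID="f1"` attribute are made up for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Value of the "type" attribute shown in the example <notes> element above.
NOTE_TYPE = "DATAVERSE:FILEDESC"


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def export_url(server_url, persistent_id):
    """Build a DDI export URL. The endpoint path is an assumption; check the API guide."""
    query = urlencode({"exporter": "ddi", "persistentId": persistent_id})
    return f"{server_url.rstrip('/')}/api/datasets/export?{query}"


def file_descriptions(ddi_xml):
    """Return the text of every file description note found under a <fileDscr>."""
    root = ET.fromstring(ddi_xml)
    found = []
    for file_dscr in root.iter():
        if local_name(file_dscr.tag) != "fileDscr":
            continue
        for child in file_dscr.iter():
            if local_name(child.tag) == "notes" and child.get("type") == NOTE_TYPE:
                found.append((child.text or "").strip())
    return found


# Hypothetical, heavily trimmed export; a real DDI export has many more elements.
SAMPLE = """<codeBook xmlns="ddi:codebook:2_5">
  <fileDscr ID="f1">
    <notes level="file" type="DATAVERSE:FILEDESC" subject="DataFile Description">
       This is a tabular file produced from a Stata .dta file with rich descriptive metadata
    </notes>
  </fileDscr>
</codeBook>"""

print(file_descriptions(SAMPLE))
```

To run it against a real installation, fetch the URL built by `export_url` with whatever HTTP client you prefer and pass the response body to `file_descriptions`; an empty list means no description note was found under any `<fileDscr>`.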

For extra credit, look at the file under Data Explorer and verify that the new <notes> element isn't causing any trouble there (the Explorer relies on the DDI for viewing and, in the latest version, editing).
Does this PR introduce a user interface change? If mockups are available, please link/include them here:

Is there a release notes update needed for this change?:

Additional documentation:

@coveralls

Coverage Status

coverage: 20.867% (-0.001%) from 20.868%
when pulling 84e0fad on 5051-ddi-tabular-file-description
into d039a10 on develop.

📦 Pushed preview images as

ghcr.io/gdcc/dataverse:5051-ddi-tabular-file-description
ghcr.io/gdcc/configbaker:5051-ddi-tabular-file-description

🚢 See on GHCR. Use by referencing with full name as printed above, mind the registry name.

@pdurbin (Member) left a comment

Overall, looks fine. I'm trusting that "notes" is the right place to put the file descriptions. I did leave a couple other comments.

@cmbz added the FY25 Sprint 8 (2024-10-09 - 2024-10-23), Size: 3 (a percentage of a sprint; 2.1 hours) and FY25 Sprint 9 (2024-10-23 - 2024-11-06) labels on Oct 23, 2024
Labels: FY25 Sprint 8 (2024-10-09 - 2024-10-23); FY25 Sprint 9 (2024-10-23 - 2024-11-06); Size: 3 (a percentage of a sprint; 2.1 hours)
Projects
Status: Ready for Review ⏩
Development

Successfully merging this pull request may close these issues.

File description metadata of ingested files are not in the DDI exported metadata