Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

detected mime type of m4a uploaded in zip #10923

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

jo-pol
Copy link
Contributor

@jo-pol jo-pol commented Oct 14, 2024

What this PR does / why we need it:

store the proper mime type (and show the previewer) for m4a files uploaded in a zip

Which issue(s) this PR closes:

Special notes for your reviewer:

Suggestions on how to test this:

See issue

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

Is there a release notes update needed for this change?:

files with extension m4a uploaded in a zip will get content-type audio/x-m4a rather than application/octet-stream

Additional documentation:

@pdurbin
Copy link
Member

pdurbin commented Oct 15, 2024

Huh, the API tests passed but afterward there was an Ansible error at https://jenkins.dataverse.org/job/IQSS-Dataverse-Develop-PR/job/PR-10923/1/console

Ansible error

Branch 10922-content-type-of-m4a-in-zip from https://github.com/DANS-KNAW-jp/dataverse has been deployed to http://ec2-3-95-195-223.compute-1.amazonaws.com/
destroying AWS instance
{
"TerminatingInstances": [
{
"CurrentState": {
"Code": 32,
"Name": "shutting-down"
},
"InstanceId": "i-05db8c82e47c5a34f",
"PreviousState": {
"Code": 16,
"Name": "running"
}
}
]
}
removing EC2 PEM
[Pipeline] }
[Pipeline] // stage
[Pipeline] stage
[Pipeline] { (Post)
[Pipeline] junit
Recording test results
[Checks API] No suitable checks publisher found.
[Pipeline] jacoco
[JaCoCo plugin] Collecting JaCoCo coverage data...
[JaCoCo plugin] Version: 3.3.6
[JaCoCo plugin] target/.exec;target/classes;src/main/java; locations are configured
[JaCoCo plugin] Number of found exec files for pattern target/
.exec: 1
[JaCoCo plugin] Saving matched execfiles: /home/worker/workspace/SS-Dataverse-Develop-PR_PR-10923/target/jacoco-unit.exec
[JaCoCo plugin] Saving matched class directories for class-pattern: target/classes:
[JaCoCo plugin] - /home/worker/workspace/SS-Dataverse-Develop-PR_PR-10923/target/classes 1124 files
[JaCoCo plugin] Saving matched source directories for source-pattern: src/main/java:
[JaCoCo plugin] Source Inclusions: /*.java,/.groovy,**/.kt,**/.kts
[JaCoCo plugin] Source Exclusions:
[JaCoCo plugin] - /home/worker/workspace/SS-Dataverse-Develop-PR_PR-10923/src/main/java 919 files
[JaCoCo plugin] Loading inclusions files..
[JaCoCo plugin] inclusions: []
[JaCoCo plugin] exclusions: [src/test
]
[JaCoCo plugin] Thresholds: JacocoHealthReportThresholds [minClass=0, maxClass=0, minMethod=0, maxMethod=0, minLine=0, maxLine=0, minBranch=0, maxBranch=0, minInstruction=0, maxInstruction=0, minComplexity=0, maxComplexity=0]
[JaCoCo plugin] Publishing the results..
[JaCoCo plugin] Loading packages..
[JaCoCo plugin] Done.
[JaCoCo plugin] Overall coverage: class: 44.10933, method: 24.452742, line: 20.689976, branch: 16.892052, instruction: 21.34378, complexity: 16.780916
[Pipeline] script
[Pipeline] {
[Pipeline] fileExists
[Pipeline] error
[Pipeline] }
[Pipeline] // script
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // node
[Pipeline] End of Pipeline
ERROR: Ansible run terminated abnormally, failing build.

GitHub has been notified of this commit’s build result

Finished: FAILURE

It seems unrelated to this PR. Thanks for the PR, @jo-pol

@pdurbin pdurbin added Size: 3 A percentage of a sprint. 2.1 hours. Type: Bug a defect labels Oct 15, 2024
@pdurbin pdurbin changed the title mime type of m4a uploaded in zip detected mime type of m4a uploaded in zip Oct 15, 2024
@cmbz cmbz added FY25 Sprint 8 FY25 Sprint 8 (2024-10-09 - 2024-10-23) FY25 Sprint 9 FY25 Sprint 9 (2024-10-23 - 2024-11-06) labels Oct 23, 2024
@qqmyers
Copy link
Member

qqmyers commented Oct 25, 2024

@jo-pol - given https://mimetype.io/audio/x-m4a - should this use audio/mp4 instead? I'm not certain, but I think using audio/mp4 would make files uploaded as audio/x-m4a end up with the audio/mp4 type. (changing existing files would require running the redetect API). I see we have both in the MimeTypeDisplay file:

audio/mp4=MPEG-4 Audio
audio/x-m4a=MPEG-4 Audio
but we currently don't have an example with audio/mp4 for the Previewers.

Also note that the jenkins build problem here is the same as on #10899 - somehow the version of count.pl being picked up by ansible is different that the one in the IQSS/develop branch:

TASK [dataverse : place Leonid's PERL scripts] *********************************
changed: [localhost] => (item=https://raw.githubusercontent.com/DANS-KNAW-jp/dataverse/develop/scripts/database/querycount/parse.pl)
changed: [localhost] => (item=https://raw.githubusercontent.com/DANS-KNAW-jp/dataverse/develop/scripts/database/querycount/count.pl)

TASK [dataverse : get name of Postgres logfile] ********************************
ok: [localhost]

TASK [dataverse : there should be only one] ************************************
ok: [localhost]

TASK [dataverse : fire at will] ************************************************
fatal: [localhost]: FAILED! => {"changed": true, "cmd": "/usr/bin/perl count.pl --non-interactive /var/lib/pgsql/13/data/log/postgresql-Mon.log > /tmp/query_count.out", "delta": "0:00:00.045359", "end": "2024-10-14 10:54:38.761200", "msg": "non-zero return code", "rc": 2, "start": "2024-10-14 10:54:38.715841", "stderr": "usage: ./count.pl <PGLOGFILE>", "stderr_lines": ["usage: ./count.pl <PGLOGFILE>"], "stdout": "", "stdout_lines": []}

@qqmyers
Copy link
Member

qqmyers commented Oct 25, 2024

Re - the build failure - it looks like this is from a bug in dataverse-ansible and doesn't have anything to do with your branches. We're working on a fix but won't hold up the PR for that once we resolve the audio/mp4 x-m4a issue.

@jo-pol
Copy link
Contributor Author

jo-pol commented Oct 31, 2024

Re (x-)m4a
When uploaded stand alone firefox provides mp4, chrome and edge x-m4a I simply followed the major browsers. However, https://mimetype.io/audio/x-m4a seems pretty clear.

Having said that and beyond the scope of the issue: shouldn't we overrule the browser provided mime type in FileUtil.useRecognizedType for deprecated types?

A second aside: it looks to me that CreateNewDataFilesCommand does a pretty inefficient job scanning a file multiple times to recognize the type and throwing that result away to use the supplied type.

@qqmyers
Copy link
Member

qqmyers commented Oct 31, 2024

How about adding both with a comment that the x- version is obsolete but still used in 2024? That would support all browser now (assuming people register the player for both types).

Re: shouldn't we overrule the browser provided mime type in FileUtil.useRecognizedType for deprecated types? - probably, the note on useRecognizedType indicates this was thought about (in general, not specific to obsolete types) and just not implemented so far.

Re: it looks to me that CreateNewDataFilesCommand does a pretty inefficient job scanning a file multiple times to recognize the type and throwing that result away to use the supplied type. - It's hard to tell, but I think any given file is only scanned once, but there are different paths through that command, so there are different places where we recognize normally uploaded files, direct uploaded files, files in zips, shapefile components, etc. It does look like we want to always try recognizing once, since we could discover the real type is one of the ingestible ones even though the browser says otherwise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
FY25 Sprint 8 FY25 Sprint 8 (2024-10-09 - 2024-10-23) FY25 Sprint 9 FY25 Sprint 9 (2024-10-23 - 2024-11-06) Size: 3 A percentage of a sprint. 2.1 hours. Type: Bug a defect
Projects
Status: In Review 🔎
Development

Successfully merging this pull request may close these issues.

m4a files uploaded in a zip get wrong mimetype
4 participants