Updated spark with scala and python #36997
base: main
Gen AI suggestions to solve the build error:
• Detected Error: The build seems to halt after attempting to download Scala and Maven without an explicit error message. The pipeline doesn't proceed past the curl commands.
• Error Category: Build Configuration/Environment
• Failure Point: During the make-distribution.sh script execution when trying to download build dependencies
• Root Cause Analysis: The script is attempting to download Scala and Maven but likely failing silently. The environment appears to be missing required Maven configuration.
• Suggested Fix:
environment:
  contents:
    packages:
      - maven-wrapper
      - scala
      # existing packages...
  environment:
    LANG: en_US.UTF-8
    M2_HOME: /usr/share/maven
    MAVEN_OPTS: "-Xmx2048m -XX:ReservedCodeCacheSize=512m"
• Explanation: The build system is trying to download build tools at runtime, which can be unreliable. By providing Maven and Scala through the package manager and configuring proper Maven environment variables, we ensure the build tools are available and properly configured.
• Additional Notes:
• References:
Consider this a critical fix since build reliability is essential for CI/CD pipelines. |
Gen AI suggestions to solve the build error:
• Detected Error:
• Error Category: Dependency/Network
• Failure Point: Maven download and verification during the build process
• Root Cause Analysis:
• Suggested Fix:
environment:
  contents:
    packages:
      - maven # Use the system-provided Maven package instead
Remove the explicit Maven download since Wolfi already provides Maven as a package.
• Explanation:
• Additional Notes:
• References:
|
Gen AI suggestions to solve the build error: Based on the error output, here's my analysis:
• Detected Error:
• Error Category: Dependency/Network
• Failure Point: Maven download and verification during the make-distribution.sh script execution
• Root Cause Analysis: The build is failing because:
• Suggested Fix:
environment:
  contents:
    packages:
      # Existing packages...
      - maven-3.9
  environment:
    LANG: en_US.UTF-8
    M2_HOME: /usr/share/java/maven-3.9
    MAVEN_HOME: /usr/share/java/maven-3.9
    PATH: /usr/share/java/maven-3.9/bin:$PATH
--- a/dev/make-distribution.sh
+++ b/dev/make-distribution.sh
@@ -166,7 +166,7 @@
-MVN="$SPARK_HOME/build/mvn"
+MVN="mvn"
• Explanation:
• Additional Notes:
• References: |
Gen AI suggestions to solve the build error:
• Detected Error:
• Error Category: Dependency/Network
• Failure Point: Maven download and verification during the build process
• Root Cause Analysis: The build is failing because it cannot download Maven 3.9.6 from Apache's archive server and consequently cannot verify its checksum. This appears to be either a network connectivity issue or a temporary outage of the Apache archive server.
• Suggested Fix:
environment:
  contents:
    packages:
      # existing packages...
  environment:
    # existing environment vars...
    PATH: /usr/share/maven/bin:$PATH
    M2_HOME: /usr/share/maven
• Explanation: The build environment already includes Maven 3.9 from Wolfi's package repository (
• Additional Notes:
• References: |
Gen AI suggestions to solve the build error:
• Detected Error: The build is failing while trying to download Maven and Scala dependencies during the build process.
• Error Category: Dependency/Download
• Failure Point: In the make-distribution.sh script when trying to download Maven and Scala dependencies
• Root Cause Analysis: The build is attempting to download Maven and Scala from external sources, but appears to be failing. This is likely because we already have Maven installed in the build environment and should use that instead.
• Suggested Fix:
diff --git a/dev/make-distribution.sh b/dev/make-distribution.sh
--- a/dev/make-distribution.sh
+++ b/dev/make-distribution.sh
@@ -162,6 +162,9 @@
 VERSION=$VERSION
 if [ -z "$JAVA_HOME" ]; then
   echo "Error: JAVA_HOME is not set, cannot proceed."
   exit -1
 fi
+
+# Use system Maven instead of downloading
+MVN="mvn"
• Explanation:
• Additional Notes:
• References:
Follow up action: Let me know if you need additional build error details after applying this fix. |
Gen AI suggestions to solve the build error:
• Detected Error:
• Error Category: Dependency/Network
• Failure Point: Maven download and verification step during make-distribution.sh execution
• Root Cause Analysis: The build is failing because it cannot download Maven 3.9.6 from Apache's archive server, likely due to network connectivity issues or server unavailability. The subsequent checksum verification fails because the download was incomplete.
• Suggested Fix:
--- a/dev/make-distribution.sh
+++ b/dev/make-distribution.sh
- # Download Maven if it hasn't been installed
- if [ ! -f "${MAVEN_PATH}/bin/mvn" ]; then
-   # Download Maven
-   MAVEN_VERSION=3.9.6
-   wget "https://archive.apache.org/dist/maven/maven-3/${MAVEN_VERSION}/binaries/apache-maven-${MAVEN_VERSION}-bin.tar.gz"
-   tar xzf "apache-maven-${MAVEN_VERSION}-bin.tar.gz"
-   rm -rf "${MAVEN_PATH}"
-   mv "apache-maven-${MAVEN_VERSION}" "${MAVEN_PATH}"
- fi
• Explanation: Since Wolfi OS provides Maven 3.9 as a package and it's already included in the build environment packages list, we can remove the Maven download step from the make-distribution.sh script. This eliminates the network dependency and uses the system-provided Maven instead.
• Additional Notes:
• References:
|
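Several of the comments above attribute the failure to a bad or incomplete Maven download followed by a failed checksum. A minimal shell sketch for checking a downloaded archive by hand, assuming `sha512sum` from coreutils is available; `verify_sha512` is a hypothetical helper name and not part of Spark's build scripts:

```shell
# Hypothetical helper (not from Spark): compare a file's SHA-512 digest
# against an expected hex string. Prints "OK" on a match, otherwise
# writes a message to stderr and returns non-zero.
verify_sha512() {
  file=$1
  expected=$2
  # sha512sum prints "<digest>  <filename>"; keep only the digest.
  actual=$(sha512sum "$file" | awk '{print $1}')
  if [ "$actual" = "$expected" ]; then
    echo "OK"
  else
    echo "Bad checksum for $file" >&2
    return 1
  fi
}
```

Called as `verify_sha512 apache-maven-3.9.6-bin.tar.gz "$expected_digest"`, where `$expected_digest` would come from the published `.sha512` file. A truncated download fails this check in the same way a corrupted one does, so a mismatch alone does not say which of the two happened.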
Gen AI suggestions to solve the build error: Based on the build error log, I can help diagnose and fix the issue.
• Detected Error:
• Error Category: Build/Configuration
• Failure Point: make-distribution.sh script trying to download Scala and Maven dependencies
• Root Cause Analysis: The build is attempting to download Scala and Maven instead of using the system-provided Maven from the environment packages. This is happening because the build script isn't properly detecting or using the installed Maven package.
• Suggested Fix:
environment:
  environment:
    LANG: en_US.UTF-8
    MAVEN_HOME: /usr/share/java/maven-3.9
    M2_HOME: /usr/share/java/maven-3.9
    PATH: /usr/share/java/maven-3.9/bin:$PATH
    SCALA_HOME: /usr/share/java/scala
• Explanation: By explicitly setting the Maven and Scala environment variables, we ensure the build system uses the packaged versions instead of trying to download them. This addresses the underlying issue while maintaining the build integrity.
• Additional Notes:
• References:
|
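The recurring suggestion in these comments is to prefer a system-provided Maven over a downloaded one. A rough shell sketch of that selection logic, assuming a POSIX shell; `resolve_mvn` and its fallback argument are hypothetical names for illustration and do not appear in make-distribution.sh:

```shell
# Hypothetical helper: print the Maven command to use.
# If an `mvn` executable is found on PATH, print its location;
# otherwise print the fallback passed as the first argument.
resolve_mvn() {
  fallback=$1
  if command -v mvn >/dev/null 2>&1; then
    command -v mvn
  else
    echo "$fallback"
  fi
}
```

For example, `MVN=$(resolve_mvn "$SPARK_HOME/build/mvn")` would keep the bundled wrapper as a fallback instead of hardcoding either choice.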
Gen AI suggestions to solve the build error:
• Detected Error:
• Error Category: Dependency/Network
• Failure Point: Maven download and verification step in make-distribution.sh
• Root Cause Analysis: The build is failing because it cannot download Maven 3.9.6 from Apache's archive server and verify its checksum. This appears to be either a network connectivity issue or potentially a temporary outage of the Apache archive server.
• Suggested Fix:
--- a/dev/make-distribution.sh
+++ b/dev/make-distribution.sh
@@ -89,8 +89,8 @@
-MVN="build/apache-maven-${MVN_VERSION}/bin/mvn"
+MVN="/usr/share/java/maven-3.9/bin/mvn"
• Explanation:
• Additional Notes:
• References:
|
Gen AI suggestions to solve the build error: Based on the build error output, I can help diagnose and fix the issue.
• Detected Error: The build is failing during the curl commands to download Scala and Maven dependencies
• Error Category: Dependency
• Failure Point: During the make-distribution.sh script execution when trying to download external dependencies
• Root Cause Analysis: The build script is trying to download Scala and Maven binaries but appears to be failing silently. This could be due to:
• Suggested Fix:
diff --git a/dev/make-distribution.sh b/dev/make-distribution.sh
index a123456..b123456 100755
--- a/dev/make-distribution.sh
+++ b/dev/make-distribution.sh
@@ -166,7 +166,10 @@
 local url=$1
 local output=$2
- curl --silent --show-error -L "$url" > "$output"
+ curl --silent --show-error -L --retry 3 --retry-delay 2 \
+   --max-time 60 --fail "$url" > "$output" || {
+   echo "Failed to download $url" >&2
+   return 1
+ }
• Explanation:
• Additional Notes:
• References:
|
Gen AI suggestions to solve the build error:
• Detected Error: "patching file dev/make-distribution.sh
• Error Category: Configuration
• Failure Point: Patch application step in the pipeline
• Root Cause Analysis: The patch file make-distribution.patch cannot be applied cleanly to the source code, likely because the target file content has changed in version 3.5.3 compared to what the patch was created against.
• Suggested Fix:
git clone https://github.com/apache/spark
cd spark
git checkout v3.5.3
# Make your changes to dev/make-distribution.sh
git diff dev/make-distribution.sh > make-distribution.patch
• Explanation: Patch failures typically occur when the target file has changed and no longer matches the patch context. Regenerating the patch against the current version ensures the line numbers and context match exactly.
• Additional Notes:
• References:
|
Gen AI suggestions to solve the build error: Based on the build error output, I can help identify and fix the issue:
• Detected Error:
• Error Category: Build Configuration/Dependencies
• Failure Point: During make-distribution.sh script execution when trying to download Scala dependencies
• Root Cause Analysis: The build script is attempting to download Scala and Maven dependencies directly, but appears to be failing silently. This is likely due to either network connectivity issues or missing curl configurations.
• Suggested Fix:
pipeline:
  - runs: |
      # Pre-download required dependencies
      mkdir -p build
      curl -L https://downloads.lightbend.com/scala/2.12.18/scala-2.12.18.tgz -o build/scala.tgz
      curl -L https://archive.apache.org/dist/maven/maven-3/3.9.6/binaries/apache-maven-3.9.6-bin.tar.gz -o build/maven.tar.gz
      # Extract dependencies
      cd build
      tar xf scala.tgz
      tar xf maven.tar.gz
      cd ..
      # Now run make-distribution
      ./dev/make-distribution.sh ...
• Explanation:
• Additional Notes:
• References: |
Gen AI suggestions to solve the build error:
• Detected Error: "patching file dev/make-distribution.sh
• Error Category: Configuration/Patch
• Failure Point: Patch application step in the pipeline failing to apply make-distribution.patch
• Root Cause Analysis: The patch file is either malformed or doesn't match the target file content at version 3.5.3, likely due to upstream changes in the make-distribution.sh script
• Suggested Fix:
git clone https://github.com/apache/spark
cd spark
git checkout v3.5.3
# Make your changes to dev/make-distribution.sh
git diff dev/make-distribution.sh > make-distribution.patch
- runs: |
    # Add your modifications here directly
    sed -i 's/original/replacement/' dev/make-distribution.sh
• Explanation: The patch is failing because the context lines in the patch file don't match the target file at version 3.5.3. Regenerating the patch against the exact version ensures the line numbers and context match perfectly.
• Additional Notes:
• References:
|
Gen AI suggestions to solve the build error:
• Detected Error: "Bad checksum from https://archive.apache.org/dist/maven/maven-3/3.9.6/binaries/apache-maven-3.9.6-bin.tar.gz.sha512"
• Error Category: Build Configuration
• Failure Point: Maven installation during build setup
• Root Cause Analysis: The build is attempting to download Maven 3.9.6 but failing due to checksum verification. This is likely because the build script is trying to use a newer Maven version than what's available in the environment.
• Suggested Fix:
environment:
  contents:
    packages:
      - maven-3.9 # Use system Maven instead of downloading
  environment:
    M2_HOME: /usr/share/java/maven-3.9
    PATH: /usr/share/java/maven-3.9/bin:$PATH
--- a/dev/make-distribution.sh
+++ b/dev/make-distribution.sh
@@ -166,6 +166,7 @@
 MVN="$SPARK_HOME/build/mvn"
 fi
+MVN="mvn" # Force use of system Maven
• Explanation: The build is failing because it's trying to download and verify Maven 3.9.6, but we already have Maven 3.9.x installed in the system. By forcing the use of system Maven and properly setting up the Maven environment variables, we can bypass this download step and use the pre-installed version.
• Additional Notes:
• References:
|
Gen AI suggestions to solve the build error:
• Detected Error: "patch: **** malformed patch at line 13: @@ -166,7 +166,7 @@ export MAVEN_OPTS="${MAVEN_OPTS:--Xss128m -Xmx4g -XX:ReservedCodeCacheSize=128m}"
• Error Category: Build Configuration
• Failure Point: The "patch" step in the pipeline attempting to apply make-distribution.patch
• Root Cause Analysis: The patch file appears to be malformed or corrupted, specifically at line 13. This is typically caused by incorrect patch formatting, line endings, or copy/paste errors.
• Suggested Fix:
dos2unix make-distribution.patch # Convert line endings
--- a/dev/make-distribution.sh
+++ b/dev/make-distribution.sh
@@ -166,7 +166,7 @@
• Explanation: Patch files must follow strict formatting rules. The error indicates the diff header line is malformed, which is a common issue when patches are created or edited on different platforms or through copy/paste.
• Additional Notes:
git diff --no-prefix original_file modified_file > make-distribution.patch
• References:
|
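The line-ending hypothesis in the comment above is cheap to check before regenerating the patch. A small shell sketch using only `grep` and `tr`; `has_cr` and `strip_cr` are hypothetical helper names, and this is only an illustration of the check, not something the PR itself runs:

```shell
# Hypothetical helper: succeed if the file contains any carriage return,
# which is what a patch saved with Windows (CRLF) line endings would have.
has_cr() {
  cr=$(printf '\r')
  grep -q "$cr" "$1"
}

# Hypothetical helper: write the file to stdout with all carriage returns
# removed, roughly what dos2unix does for a plain-text patch.
strip_cr() {
  tr -d '\r' < "$1"
}
```

Usage would look like `has_cr make-distribution.patch && strip_cr make-distribution.patch > fixed.patch`. If `has_cr` reports nothing, line endings are not the cause and the hunk header or context lines are the more likely culprit.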
There was an escalation from a customer regarding spark https://github.com/chainguard-dev/customer-issues/issues/1926
The previous spark-3.5 package has now also been renamed to spark-3.5-scala-2.12 to reflect the earlier change to scala-2.13.
Related: https://github.com/chainguard-dev/customer-issues/issues/1926
Pre-review Checklist
For new package PRs only