Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a new architecture mode: 'avx512-sr'. #4025

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

mulugetam
Copy link
Contributor

This PR adds a new architecture mode to support the new extensions to AVX512, namely AVX512-FP16, which have been available since Intel® Sapphire Rapids.

This PR is a prerequisite for PR#4020 that speeds up hamming distance evaluations.

@mengdilin
Copy link
Contributor

Hmm weird the CIs are not running on the PR. Do you mind pushing a new commit and see if the CIs start?

Signed-off-by: Mulugeta Mammo <[email protected]>
@mulugetam
Copy link
Contributor Author

@mengdilin Pushed a new commit but CI not starting. Is this possibly because I updated .github/workflows/build.yml?

@mengdilin
Copy link
Contributor

Yea there is a syntax error in the build file: see https://github.com/facebookresearch/faiss/actions/runs/11804384709

@@ -67,6 +67,17 @@ jobs:
uses: ./.github/actions/build_cmake
with:
opt_level: avx512
linux-x86_64-AVX512-cmake:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

linux-x86_64-AVX512-advanced-cmake

@mulugetam
Copy link
Contributor Author

@mengdilin CI is picking g++ version 11. But AVX512-FP16 (-mavx512fp16) requires version 12+.

@mengdilin
Copy link
Contributor

Ah yea the conda publication CI is using the older compiler version. We should investigate on our side; however, I don't think we want to publish this architecture mode to conda right now, can you omit it?

Looks like CI failure is coming from a unit test failure from

def test_ivf_train_2level(self):

Try commenting out that test and see if anything else fails?
I'm going on PTO, handing this over to the next performance oncall @kuarora

-DFAISS_ENABLE_GPU=OFF \
-DFAISS_ENABLE_PYTHON=OFF \
-DBLA_VENDOR=Intel10_64lp \
-DCMAKE_INSTALL_LIBDIR=lib \
-DCMAKE_BUILD_TYPE=Release .

make -C _build -j$(nproc) faiss faiss_avx2 faiss_avx512
make -C _build -j$(nproc) faiss faiss_avx2 faiss_avx512 faiss_avx512_sr
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is used for faiss's conda packaging upload. I don't think we want to expose this build mode yet in conda officially. Can you omit this for now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

@mulugetam
Copy link
Contributor Author

Thanks @mengdilin. @kuarora Could you please review?

@alexanderguzhva
Copy link
Contributor

@mulugetam I would use -march=sapphirerapids -mtune=sapphirerapids for the compiler flags, because SR supports many other AVX512 instruction extensions that are not currently listed among compiler flags

Copy link
Contributor

@mengdilin mengdilin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Back from PTO. LGTM, can you resolve the conflict so I can import the PR. For the unit test, can you relax the threshold instead of commenting it out? If somehow you cannot relax this threshold, you can skip this unittest when in SR mode similar to

@unittest.skipIf(
(ideally we don't have to do this)

endif()
if(NOT WIN32)
# Architecture mode to support AVX512 extensions available since Intel(R) Sapphire Rapids.
# Ref: https://networkbuilders.intel.com/solutionslibrary/intel-avx-512-fp16-instruction-set-for-intel-xeon-processor-based-products-technology-guide
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the ref!

@@ -568,7 +569,7 @@ def test_ivf_train_2level(self):
# normally 47 / 200 differences
ndiff = (Iref != Inew).sum()
self.assertLess(ndiff, 51)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there away to relax the threshold such that this test passes in SR mode as well?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants