EliasJonsson n00b

Joined: 18 Oct 2017 Posts: 20
|
Posted: Thu Aug 27, 2020 6:39 pm Post subject: Building the linux kernel with custom CFLAGS |
|
|
Dear Gentoo community,
I am building the kernel on a Raspberry Pi 4. I tried adding custom CFLAGS to the KBUILD_HOSTCFLAGS variable in the Makefile in the kernel's main directory. That didn't work: none of them show up when I compile.
I have also tried compiling with:
Code: | KBUILD_CFLAGS="-O2 -pipe -march=armv8-a+crc+simd -mtune=cortex-a72 -mfloat-abi=hard -mfpu=vfp -meabi=5 -fomit-frame-pointer -ftree-vectorize -fpredictive-commoning" make V=1 -j4 |
But that didn't work either.
I also tried adding
Code: | ccflags-y := -O2 -pipe -march=armv8-a+crc+simd -mtune=cortex-a72 -mfloat-abi=hard -mfpu=vfp -meabi=5 -fomit-frame-pointer -ftree-vectorize -fpredictive-commoning |
to the main Makefile, but that didn't work either.
Is there a way to compile the kernel with custom CFLAGS? |
 |
Ionen Veteran

Joined: 06 Dec 2018 Posts: 1590
|
Posted: Thu Aug 27, 2020 6:43 pm Post subject: |
|
|
I recall you need to use make KCFLAGS="..." (if that still works), but don't do it if it can be avoided. Things like -ftree-vectorize definitely have no business being there. I'm not familiar with building kernels for the Pi, but I'm pretty sure you don't need to set any CFLAGS yourself. |
 |
NeddySeagoon Administrator


Joined: 05 Jul 2003 Posts: 46999 Location: 56N 3W
|
Posted: Thu Aug 27, 2020 7:08 pm Post subject: |
|
|
EliasJonsson,
Read the Makefile ...
Code: | $ grep -A3 "as the last" /usr/src/linux/Makefile
# Add user supplied CPPFLAGS, AFLAGS and CFLAGS as the last assignments
KBUILD_CPPFLAGS += $(KCPPFLAGS)
KBUILD_AFLAGS += $(KAFLAGS)
KBUILD_CFLAGS += $(KCFLAGS) |
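The excerpt above hints at why setting KBUILD_CFLAGS directly tends not to behave as hoped. The sketch below uses a throwaway toy Makefile (a stand-in written for this illustration, not the kernel's real one) that assigns KBUILD_CFLAGS itself and then appends KCFLAGS. It assumes GNU make; the names demo_dir, extract_flags and the *_result variables are made up for the example.

```shell
#!/bin/sh
# Toy stand-in for the kernel's top-level Makefile (NOT the real one):
# it sets KBUILD_CFLAGS itself, then appends the user-supplied KCFLAGS.
demo_dir=$(mktemp -d)
cat > "$demo_dir/Makefile" <<'EOF'
KBUILD_CFLAGS := -O2 -Wall
KBUILD_CFLAGS += $(KCFLAGS)
$(info FLAGS=$(KBUILD_CFLAGS))
all: ;
EOF

# Keep only the FLAGS= line printed by the toy Makefile and trim
# trailing whitespace.
extract_flags() {
    sed -n '/^FLAGS=/{s/^FLAGS=//;s/[[:space:]]*$//;p;}'
}

# 1. KBUILD_CFLAGS in the environment: the Makefile's own assignment
#    takes precedence, so the custom flag is lost.
env_result=$(KBUILD_CFLAGS="-march=native" make -s -f "$demo_dir/Makefile" | extract_flags)

# 2. KBUILD_CFLAGS on the make command line: this overrides every
#    assignment in the Makefile, so the Makefile's own flags are lost.
cmdline_result=$(make -s -f "$demo_dir/Makefile" KBUILD_CFLAGS="-march=native" | extract_flags)

# 3. KCFLAGS: appended after the Makefile's own flags.
kcflags_result=$(make -s -f "$demo_dir/Makefile" KCFLAGS="-march=native" | extract_flags)

printf 'environment KBUILD_CFLAGS:  %s\n' "$env_result"
printf 'command-line KBUILD_CFLAGS: %s\n' "$cmdline_result"
printf 'KCFLAGS:                    %s\n' "$kcflags_result"

rm -rf "$demo_dir"
```

Only the third form keeps the Makefile's flags and adds the custom one at the end, which is the behaviour the "as the last assignments" comment describes.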
Don't use Code: | -mfloat-abi=hard -mfpu=vfp | Those flags are not for Code: | -march=armv8-a+crc+simd |
If you are lucky, they will be ignored in 64-bit mode.
Some of the other flags look scary too. That's the open source way. When it breaks, you can keep the pieces.
== Edit ==
+simd is implied by -march=armv8-a. It's not an optional extra. +crc is correct. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
 |
EliasJonsson n00b

Joined: 18 Oct 2017 Posts: 20
|
Posted: Fri Aug 28, 2020 3:53 am Post subject: |
|
|
As NeddySeagoon said, setting the KBUILD_CPPFLAGS and KBUILD_CFLAGS in the main Makefile did the trick.
Omitted a few of the optimization flags as per your recommendations.
Thank you very much for your support! |
 |
Marcih Apprentice


Joined: 19 Feb 2018 Posts: 210
|
Posted: Fri Aug 28, 2020 8:25 am Post subject: |
|
|
I remember an experimental genpatches patch, removed not too long ago, that let you set your -march via the "Processor family" setting in menuconfig. What is now the proper way to compile with the right -march for your CPU? _________________
Bones McCracker wrote: | It wouldn't be so bad, if it didn't suck. |
NeddySeagoon wrote: | The problem with leaving is that you can only do it once and it reduces your influence. |
 |
EliasJonsson n00b

Joined: 18 Oct 2017 Posts: 20
|
Posted: Sat Aug 29, 2020 7:24 am Post subject: |
|
|
I tried building the kernel using a few variations of optimization flags. The one I found working (building successfully and running perfectly) for the Raspberry Pi 4 with the most optimization was:
Code: | KCFLAGS="-O2 -pipe -march=armv8-a+crc -mtune=cortex-a72 -fpredictive-commoning -ftree-vectorize" make V=1 -j4 |
Trying to build the kernel with -fomit-frame-pointer was unsuccessful.
Marcih, the way as shown above I think is the correct way to apply the -march flag when building the kernel. |
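Before a full kernel build, it can save time to check which candidate flags the compiler accepts at all. The helper below is a minimal sketch loosely in the spirit of the kernel's own cc-option check; cc_accepts is a hypothetical name, and CC is assumed to point at the compiler you build with (it falls back to cc). A flag being accepted only means the compiler recognises it, not that the kernel will build or run correctly with it.

```shell
#!/bin/sh
# Return success if the compiler accepts all of the given flags.
# CC defaults to "cc"; set CC to your (cross) compiler as needed.
cc_accepts() {
    # Parse an empty C file only; no object file is written.
    "${CC:-cc}" "$@" -fsyntax-only -x c /dev/null >/dev/null 2>&1
}

# Candidate flags taken from this thread; results depend on the compiler
# and target architecture, so no particular outcome is implied here.
for flag in -O2 -mtune=cortex-a72 -mfloat-abi=hard -mfpu=vfp; do
    if cc_accepts "$flag"; then
        echo "accepted: $flag"
    else
        echo "rejected: $flag"
    fi
done
```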
 |
Marcih Apprentice


Joined: 19 Feb 2018 Posts: 210
|
Posted: Sun Sep 13, 2020 6:36 am Post subject: |
|
|
EliasJonsson wrote: | Marcih, the way as shown above I think is the correct way to apply the -march flag when building the kernel. | Cheers! I've tried running "make -n" with KCFLAGS set and I got the following result (using quote tags so I can highlight with the bold tag):
Quote: | [snip]
./include/linux/kconfig.h -D__KERNEL__ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Wno-format-security -std=gnu89 -fno-PIE -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx -m64 -falign-jumps=1 -falign-loops=1 -mno-80387 -mno-fp-ret-in-387 -mpreferred-stack-boundary=3 -mskip-rax-setup -mtune=generic -mno-red-zone -mcmodel=kernel -funit-at-a-time -maccumulate-outgoing-args -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -DCONFIG_AS_CFI_SECTIONS=1 -DCONFIG_AS_FXSAVEQ=1 -DCONFIG_AS_SSSE3=1 -DCONFIG_AS_CRC32=1 -DCONFIG_AS_AVX=1 -DCONFIG_AS_AVX2=1 -DCONFIG_AS_SHA1_NI=1 -DCONFIG_AS_SHA256_NI=1 -pipe -Wno-sign-compare -fno-asynchronous-unwind-tables -mindirect-branch=thunk-extern -mindirect-branch-register -DRETPOLINE -fno-delete-null-pointer-checks -Wno-frame-address -Wno-format-truncation -Wno-format-overflow -Wno-int-in-bool-context -Wno-address-of-packed-member -Wno-attribute-alias -O2 --param=allow-store-data-races=0 -DCC_HAVE_ASM_GOTO -Wframe-larger-than=2048 -fno-stack-protector -Wno-unused-but-set-variable -Wno-unused-const-variable -fomit-frame-pointer -fno-var-tracking-assignments -Wdeclaration-after-statement -Wno-pointer-sign -Wno-stringop-truncation -Wno-array-bounds -Wno-stringop-overflow -Wno-restrict -Wno-maybe-uninitialized -fno-strict-overflow -fno-merge-all-constants -fmerge-constants -fno-stack-check -fconserve-stack -Werror=implicit-int -Werror=strict-prototypes -Werror=date-time -fcf-protection=none -Wno-packed-not-aligned -O2 -pipe -march=native
[snip] |
The KCFLAGS seem to have worked as evidenced by the "-O2 -pipe -march=native" I highlighted at the very end.
It seems that, by default, the kernel already sets -mtune=generic and -O2 (also highlighted), so my -O2 is redundant, no? What about the "-mtune=generic"? I know that -march implies -mtune unless specified otherwise. As far as I understand, this would compile a kernel that runs only on my native architecture but with none of the optimizations; a bit of a lose-lose if you ask me... How do I override the "-mtune=generic"? Do I just add "-mtune=native" to my KCFLAGS when compiling? |
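When the same kind of option appears more than once on a long compiler command line, GCC documents that the last -O option is the one in effect, and as far as I know later -march/-mtune values likewise replace earlier ones. The sketch below pulls the last value of a given option out of a command line like the one quoted above; last_value is a hypothetical helper and the cmdline string is an abbreviated stand-in for the real "make -n" output.

```shell
#!/bin/sh
# Print the value of the LAST word in a command line that starts with
# the given prefix (for example "-mtune=" or "-O").
last_value() {
    prefix=$1
    result=
    # $2 is left unquoted on purpose so the shell splits it into words.
    for word in $2; do
        case $word in
            "$prefix"*) result=${word#"$prefix"} ;;
        esac
    done
    printf '%s\n' "$result"
}

# Abbreviated stand-in for the quoted compiler command line.
cmdline="-mtune=generic -mno-red-zone -O2 -fomit-frame-pointer -O2 -pipe -march=native"

last_value -mtune= "$cmdline"   # prints: generic
last_value -march= "$cmdline"   # prints: native
last_value -O "$cmdline"        # prints: 2
```

Because KCFLAGS lands at the end of the command line, a -mtune=native added there would become the last -mtune on the line.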
 |
NeddySeagoon Administrator


Joined: 05 Jul 2003 Posts: 46999 Location: 56N 3W
|
Posted: Sun Sep 13, 2020 2:18 pm Post subject: |
|
|
Marcih,
What -mcpu, -mtune and -march do varies with arch.
If you know what you want for amd64, you need to learn it again for arm64.
We need to know what CPU architecture you have in mind. |
 |
Hu Moderator

Joined: 06 Mar 2007 Posts: 16456
|
Posted: Sun Sep 13, 2020 5:33 pm Post subject: |
|
|
Yes, -O2 is redundant if already specified and not otherwise changed by any other -O options in the intervening text.
I expect that -mtune=native would do as you intend. However, I do not think -mtune=generic means that there are no optimizations. Rather, it means the compiler may skip an optimization that works well for your chosen CPU architecture if the compiler authors are aware that other allowed CPU architectures perform poorly with that optimization.
For example, suppose that in BARv2 there existed an instruction frob that does something useful far more efficiently than open-coding the same work. In BARv1, the effect must be open-coded because the instruction did not exist. -march=BARv2 would allow use of frob. Suppose further that in BARv3 the chip designers decided that frob took too much silicon and removed the instruction, but shipped a kernel hack that knows how to emulate it when the now-illegal instruction is hit. Code running on BARv3 may wish to avoid frob and open-code the work, since the emulation is slow. Code running on BARv2 should definitely use frob, because it is hardware accelerated.
With -mtune=generic, the compiler might decide to use the open-coded version because the performance loss on BARv2 is not as bad as the gain on BARv3. It also might decide to use frob anyway, depending on the value judgments made by the compiler authors and what biases they gave the compiler. If you used -mtune=barv2, that would hint to the compiler that you care most about performance on BARv2 CPUs and would prefer it make decisions accordingly, even when those decisions produce code that is worse for BARv3. Conversely, you might explicitly set -march=barv2 -mtune=barv3 to say that the code must run on both v2 and v3 CPUs (so v3-specific instructions are not allowed), but that you expect to run on v3, so bias decisions toward optimal performance on v3, even at the expense of v2. |
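To see what a given combination of -march and -mtune resolves to on a real machine, GCC can print its effective target options with -Q --help=target. The sketch below wraps that in a small function; show_target is a hypothetical name, the option is GCC-specific (other compilers may not support it), and the exact lines printed vary by architecture and GCC version.

```shell
#!/bin/sh
# Print the -march/-mtune/-mcpu lines that GCC reports for the given flags.
# CC defaults to "gcc"; prints nothing if the compiler rejects the flags.
show_target() {
    "${CC:-gcc}" "$@" -Q --help=target 2>/dev/null |
        grep -E '^[[:space:]]*-m(arch|tune|cpu)='
}

echo "compiler defaults:"
show_target

echo "with -march=native:"
show_target -march=native

echo "with -march=native -mtune=native:"
show_target -march=native -mtune=native
```

Comparing the three listings shows whether an explicit -mtune changes anything beyond what -march already selects on your compiler.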