Gentoo Forums
Raspberry Pi 4 - Compile in RAM

costel78
Guru

Joined: 20 Apr 2007
Posts: 373

Posted: Mon Nov 18, 2019 8:28 pm    Post subject: Raspberry Pi 4 - Compile in RAM

I have a question about compiling in RAM, as one would on an x86 machine with plenty of RAM, using zram.
The purpose is to avoid SSD wear and to gain speed.

What is the best way to compile in RAM on a Raspberry Pi 4, helped by a zram disk on the main PC mounted over NFS?
I can think of the following solutions:
1. An 8 - 12 GB zram disk on the Pi, with swap over NFS;
2. A 3 - 3.5 GB zram disk on the Pi, and PORTAGE_TMPDIR="/var/tmp/notmpfs", mounted over NFS, for larger packages;
3. Compile everything over NFS, accepting slow access times over the network;
4. Something like unionfs, aufs or overlayfs to extend the zram on the Pi when local RAM is not enough, achieving something like point 1; for now, this sounds more like wishful thinking.
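
Option 2 can be sketched with Portage's per-package environment files. This is only a minimal sketch, assuming the NFS-backed directory is already mounted at /var/tmp/notmpfs. The file name notmpfs.conf, the package list and the ROOT_DIR variable are illustrative choices, not something from this thread. By default it writes into a scratch directory, so it can be tried without touching the real /etc/portage.

```shell
#!/bin/sh
# Sketch: send selected large packages to a non-RAM build directory.
# ROOT_DIR defaults to a scratch directory; set ROOT_DIR=/ to write the real files.
ROOT_DIR="${ROOT_DIR:-$(mktemp -d)}"

write_notmpfs_env() {
    # $1 = build directory for large packages, remaining args = package atoms
    tmpdir="$1"; shift
    mkdir -p "$ROOT_DIR/etc/portage/env"
    printf 'PORTAGE_TMPDIR="%s"\n' "$tmpdir" > "$ROOT_DIR/etc/portage/env/notmpfs.conf"
    for atom in "$@"; do
        printf '%s notmpfs.conf\n' "$atom"
    done >> "$ROOT_DIR/etc/portage/package.env"
}

# Hypothetical list of packages that do not fit in the local zram disk.
write_notmpfs_env /var/tmp/notmpfs www-client/chromium app-office/libreoffice sys-devel/gcc
cat "$ROOT_DIR/etc/portage/package.env"
```

If /etc/portage/package.env is a directory on your system, the entries would go into a file inside it instead.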

What do you think? Thank you!
_________________
Sorry for my English. I'm still learning this language.

joanandk
Apprentice

Joined: 12 Feb 2017
Posts: 155

Posted: Tue Nov 19, 2019 5:05 pm

Hi,

I would have suggested option 4, using overlayfs for /var/tmp/portage mapped to /tmp/upper. BUT as the RPi 4 has a maximum of 4GB RAM and the compile process needs around 1GB per thread, this might lead to emerge crashes.
In the past I have used an external spinning drive on a thin client to compile. I have never succeeded in cross-compiling, because I have not tried hard enough.

So my advice would be to use an external USB hard disk to compile, or to use a faster system and cross-compile, which is very elegant.
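
The "around 1GB per thread" rule of thumb above can be turned into a job count. This is a rough sketch under that assumption only; the function name jobs_for_ram is made up for illustration, and real packages vary a lot in memory use per job.

```shell
#!/bin/sh
# Sketch: pick a make -j value from installed RAM, assuming ~1 GiB per compile job.
jobs_for_ram() {
    # $1 = total RAM in KiB, $2 = CPU count, $3 = KiB per job (default 1 GiB)
    ram_kib="$1"; cpus="$2"; per_job="${3:-1048576}"
    jobs=$((ram_kib / per_job))
    if [ "$jobs" -lt 1 ]; then jobs=1; fi
    if [ "$jobs" -gt "$cpus" ]; then jobs="$cpus"; fi
    echo "$jobs"
}

# Read the local values and print a candidate MAKEOPTS line.
total_kib=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
echo "MAKEOPTS=\"-j$(jobs_for_ram "${total_kib:-0}" "$(nproc)")\""
```

The result is capped at the CPU count and never drops below one job.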

BR

NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 48915
Location: 56N 3W

Posted: Tue Nov 19, 2019 6:03 pm

joanandk,

Pure cross compiling is hard. There are lots of broken build systems that don't support it.
Cross compiling using distcc is mostly harmless.

On the helpers
Code:
emerge crossdev distcc

Code:
crossdev -t aarch64-unknown-linux-gnu
to get your cross toolchain.
Do make sure you get the same versions of gcc as the Pi has. That matters.
Configure distcc to --listen for cries for help, then start distccd (add it to the default runlevel).

On the Pi,
Code:
emerge distcc

Configure /etc/distcc/hosts with the list of helpers. Do not list the Pi. It will be kept busy doing the unpacking, preprocessing, linking and installing.
It will still be used as a fallback.
Add distcc to FEATURES.
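
The Pi-side configuration above might look roughly like this. It is a sketch only: the helper addresses and job limits are hypothetical, and distcc_hosts_line is a made-up helper that just builds the line for /etc/distcc/hosts (each entry is HOST/LIMIT, where LIMIT caps the jobs sent to that helper).

```shell
#!/bin/sh
# Sketch: build the helper list for /etc/distcc/hosts on the Pi.
distcc_hosts_line() {
    line=""
    for helper in "$@"; do
        case "$helper" in
            # Per the advice above, the Pi itself is not listed.
            localhost*|127.0.0.1*) continue ;;
        esac
        line="${line:+$line }$helper"
    done
    echo "$line"
}

# Hypothetical helpers; replace with your own machines.
distcc_hosts_line 192.168.1.10/8 192.168.1.11/4
# On the Pi, as root, the printed line would go into /etc/distcc/hosts,
# and /etc/portage/make.conf would gain:  FEATURES="distcc"
```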

Those are the main steps. I've glossed over the detail of the configuration.
You may read about distcc pump mode. Be careful with that. It seems to produce broken binaries from time to time.
Probably because it uses more of the build host's headers to go faster. When the Pi and build host have different headers, it doesn't work so well.

When distcc is in use, you may not use -march=native on the Pi. You need to expand -march=native.
Well, you don't want your helpers generating amd64 code, do you :)
distcc used to do that. Now it won't distribute if -march=native is in use.
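
One way to expand -march=native is to ask the compiler driver what it would pass to cc1 and keep only the target flags. The sketch below assumes that approach; expand_native is a made-up helper name, and the sample line is invented for illustration, so the real output on a Pi will differ.

```shell
#!/bin/sh
# Sketch: pull the concrete -march/-mcpu/-mtune flags out of a cc1 command line.
expand_native() {
    # Reads compiler driver output on stdin, prints the target flags on one line.
    tr ' ' '\n' | tr -d '"' \
        | grep -E '^-(march|mcpu|mtune)=' \
        | grep -v '=native$' \
        | paste -s -d ' ' -
}

# Made-up sample of what the driver might print; real output will differ.
sample='cc1 -E -quiet - "-march=armv8-a+crc" "-mtune=cortex-a72"'
echo "$sample" | expand_native

# On the Pi itself (run by hand):
#   gcc -### -E - -march=native 2>&1 | grep cc1 | expand_native
```

The printed flags would then replace -march=native in CFLAGS on the Pi.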

You can try an arm64 binhost. That's mine, Sakaki has one on github too.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

costel78
Guru

Joined: 20 Apr 2007
Posts: 373

Posted: Tue Nov 19, 2019 6:54 pm

First, thank you all for your replies.
I tested the setups and found that:
- a zram disk larger than RAM is troublesome;
- /var/tmp/portage over NFS sometimes leads to lock issues, even with NFSv3 only;
so I created a loop disk over NFS and formatted it as ext4. No issues whatsoever so far, but network access is slower than RAM.
I will use it only for packages which need more than 2GB of space to compile.
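
The loop-disk workaround could be set up along these lines. This is a sketch with hypothetical paths and size, not the exact commands used here. It prints the commands by default (DRY_RUN=1); set DRY_RUN=0 and run it as root to actually apply them.

```shell
#!/bin/sh
# Sketch: an ext4 image stored on an NFS share, loop-mounted as the build directory.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command in dry-run mode, execute it otherwise.
    if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

make_loop_disk() {
    # $1 = image file on the NFS mount, $2 = size, $3 = local mount point
    run truncate -s "$2" "$1"
    run mkfs.ext4 -F "$1"
    run mkdir -p "$3"
    run mount -o loop "$1" "$3"
}

# Hypothetical paths and size.
make_loop_disk /mnt/nfs/portage.img 20G /var/tmp/notmpfs
```

Because ext4 handles the locking inside the image, NFS only sees block reads and writes to one file.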

erm67
l33t

Joined: 01 Nov 2005
Posts: 650
Location: EU

Posted: Mon Dec 02, 2019 7:05 pm

I have always used iSCSI or nbd for that. Both are much better than NFS and take advantage of the Linux file cache. Also, zswap is a lot better than zram and can be combined with swap over nbd or swap over iSCSI if you don't want to wear the SSD. Of course, you need real gigabit Ethernet to enjoy it.

My older board only had 2GB of memory, but I was able to compile mariadb using graphite & LTO. It used 4GB of swap, but thanks to zswap and swap over iSCSI it wasn't a problem.
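
For anyone who wants to check zswap before relying on it, a small sketch follows. It assumes the usual sysfs location for the zswap module parameters; ZSWAP_DIR and zswap_show are names made up for this example.

```shell
#!/bin/sh
# Sketch: show the current zswap settings, if the kernel exposes them.
ZSWAP_DIR="${ZSWAP_DIR:-/sys/module/zswap/parameters}"

zswap_show() {
    # Prints name=value for each parameter file that is present and readable.
    for name in enabled compressor max_pool_percent; do
        if [ -r "$ZSWAP_DIR/$name" ]; then
            echo "$name=$(cat "$ZSWAP_DIR/$name")"
        fi
    done
}

zswap_show
# To enable at runtime (as root):  echo 1 > /sys/module/zswap/parameters/enabled
# To enable at boot, add zswap.enabled=1 to the kernel command line.
```

zswap only compresses pages on their way to a real swap device, so it still needs the network swap behind it.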
_________________
Ok boomer
True ignorance is not the absence of knowledge, but the refusal to acquire it.
Ab esse ad posse valet, a posse ad esse non valet consequentia

My fediverse account: @erm67@erm67.dynu.net

costel78
Guru

Joined: 20 Apr 2007
Posts: 373

Posted: Mon Dec 02, 2019 9:08 pm

I have to admit this is the first time I have heard of nbd. I will try iSCSI when I have time to set it up.
Thank you for the suggestions!

erm67
l33t

Joined: 01 Nov 2005
Posts: 650
Location: EU

Posted: Mon Dec 02, 2019 9:47 pm

nbd is Linux specific and a lot easier to set up, but probably a bit slower than a well tuned iSCSI setup.

erm67
l33t

Joined: 01 Nov 2005
Posts: 650
Location: EU

Posted: Wed Dec 04, 2019 11:58 pm

Here I am compiling kernel 5.4.1 with MAKEFLAGS=-j6 on an otherwise idle S912 box using iSCSI. As you can see, it compiles mostly in RAM: the swap is almost unused, wait time is very low, and the Ethernet link is not heavily used either. I see 1.5 up to 3 MB/s, so maybe it would also work with 100 Mbps Ethernet, but extracting the sources would probably take a long time.
I am not using zswap since the box is otherwise idle.

S912 box compiling kernel
Code:
top - 23:37:25 up 10:51,  2 users,  load average: 4.06, 1.89, 1.00
Tasks: 212 total,   8 running, 124 sleeping,   0 stopped,   0 zombie
%Cpu(s): 55.0 us,  4.2 sy,  0.0 ni, 39.9 id,  0.0 wa,  0.7 hi,  0.2 si,  0.0 st
KiB Mem :  1845116 total,   791364 free,   406964 used,   646788 buff/cache
KiB Swap:  3365884 total,  3361268 free,     4616 used.  1335644 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                                         
15627 1001      20   0  122832 112668  16356 R  99.0  6.1   0:15.94 cc1                                                                                                                                             
15828 1001      20   0   73280  61688  16008 R  99.0  3.3   0:04.38 cc1                                                                                                                                             
15842 1001      20   0   89708  77944  16176 R  99.0  4.2   0:04.07 cc1                                                                                                                                             
15902 1001      20   0   45312  32412  14044 R  36.2  1.8   0:01.10 cc1                                                                                                                                             
15909 1001      20   0   41456  29300  12620 R  33.9  1.6   0:01.03 cc1                                                                                                                                             
15912 1001      20   0   41412  29368  12620 R  33.2  1.6   0:01.01 cc1                                                                                                                                             
15893 1001      20   0    8800   3544   2244 S   1.6  0.2   0:00.05 make                                                                                                                                           
15900 1001      20   0    8776   3532   2256 S   1.6  0.2   0:00.05 make                                                                                                                                           
14959 root      20   0    7396   3144   2444 R   1.3  0.2   0:00.63 top                                                                                                                                             
11453 1001      20   0    9328   3968   2128 S   0.3  0.2   0:00.23 make                                                                                                                                           
13454 1001      20   0    8796   3476   2180 S   0.3  0.2   0:00.10 make                                                                                                                                           
18259 root      20   0       0      0      0 I   0.3  0.0   0:00.15 kworker/6:5-mm_                                                                                                                                 
19616 root      20   0       0      0      0 I   0.3  0.0   0:22.01 kworker/u16:2-e                                                                                                                                 

iftop on the S912 box
Code:
Screen filter>
└─────────────────────────────────────────┴─────────────────────────────────────────┴──────────────────────────────────────────┴─────────────────────────────────────────┴──────────────────────────────────────────
aml:33148                                                                                    => 192.168.1.115:iscsi-target                                                                   1.37Mb  1.08Mb   905Kb
                                                                                             <=                                                                                              13.9Kb  78.4Kb  47.4Kb

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
TX:             cum:   6.65MB   peak:   2.72Mb                                                                                                                                      rates:   1.38Mb  1.08Mb   907Kb
RX:                     297KB            175Kb                                                                                                                                               16.4Kb  79.7Kb  48.4Kb
TOTAL:                 6.94MB           2.73Mb                                                                                                                                               1.39Mb  1.16Mb   956Kb



1 iSCSI session with 1 LUN and 2 partitions (for swap and /var/tmp) proved to be the fastest and most reliable configuration for me; it depends, of course, on how many CPUs your NAS has.
Code:

root@aml:~# iscsiadm -m session
tcp: [1] 192.168.1.115:3260,1 iqn.2004-01.local.nas:work (non-flash)

root@aml:~# swapon
NAME              TYPE      SIZE USED PRIO
/dev/sda2 partition 3.2G   5M   -2
root@aml:~# mount
...
/dev/sda1 on /var/tmp type ext4 (rw,relatime)
...
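
Attaching such a LUN on the client could look roughly like the sketch below. The portal, IQN and device name are copied from the output above; the assumption is that the LUN is already partitioned and formatted (swap on partition 2, ext4 on partition 1). It prints the commands by default (DRY_RUN=1); set DRY_RUN=0 and run it as root to apply them.

```shell
#!/bin/sh
# Sketch: log in to the iSCSI target, then use its two partitions for swap and /var/tmp.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command in dry-run mode, execute it otherwise.
    if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

attach_build_lun() {
    # $1 = portal address, $2 = target IQN, $3 = disk device the LUN appears as
    run iscsiadm -m discovery -t sendtargets -p "$1"
    run iscsiadm -m node -T "$2" -p "$1" --login
    run swapon "${3}2"
    run mount "${3}1" /var/tmp
}

# Values taken from the session shown above; yours will differ.
attach_build_lun 192.168.1.115 iqn.2004-01.local.nas:work /dev/sda
```

When run for real, the disk may take a moment to appear after login, so the swapon and mount steps might need a short wait.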

Load on the QNAP NAS is very low, since most of the work is done in RAM on the S912:
Code:

top - 00:38:02 up 2 days,  8:41,  1 user,  load average: 0,26, 0,33, 0,24
Tasks: 122 total,   1 running, 121 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0,7 us,  1,0 sy,  0,0 ni, 98,3 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
MiB Mem :    501,6 total,     14,0 free,     97,1 used,    390,4 buff/cache
MiB Swap:    488,0 total,    486,2 free,      1,8 used.    366,8 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                                                                                                       
  652 root      20   0  137580   3636   2444 S   1,0   0,7   4:15.86 tgtd                                                                                                                                           
 4126 ermanno   20   0   10220   3080   2580 R   0,7   0,6   0:00.87 top                       


costel78
Guru

Joined: 20 Apr 2007
Posts: 373

Posted: Thu Dec 12, 2019 7:20 pm

With iSCSI, compiling works flawlessly. It is a bit harder to set up than NFS, but the overhead is limited to network latency and there are no more permission or lock errors.
Thank you for the idea, erm67!

crocket
Guru

Joined: 29 Apr 2017
Posts: 493

Posted: Sat Dec 14, 2019 6:17 am

Let me summarize this.

You use zswap over iSCSI to add more memory to a single board computer? Can iSCSI expose a tmpfs mount?

costel78
Guru

Joined: 20 Apr 2007
Posts: 373

Posted: Sat Dec 14, 2019 8:30 am

Not exactly.
I use zram over iSCSI, and use it to compile gcc, chromium and libreoffice.
I am not sure about tmpfs, but iSCSI has its own ramdisk support.
I prefer zram because it grows and shrinks dynamically (like tmpfs); iSCSI's own ramdisk uses a fixed size all the time.

Code:
/> ls
o- / ......................................................................................................................... [...]
  o- backstores .............................................................................................................. [...]
  | o- block .................................................................................................. [Storage Objects: 1]
  | | o- zram .......................................................................... [/dev/zram1 (10.0GiB) write-thru activated]
  | |   o- alua ................................................................................................... [ALUA Groups: 1]
  | |     o- default_tg_pt_gp ....................................................................... [ALUA state: Active/optimized]
  | o- fileio ................................................................................................. [Storage Objects: 0]
  | o- pscsi .................................................................................................. [Storage Objects: 0]
  | o- ramdisk ................................................................................................ [Storage Objects: 0]
  o- iscsi ............................................................................................................ [Targets: 1]
...

I don't know where tmpfs would fit, as it is neither a block device nor a file. You could mount tmpfs on a directory, create a file in it and export that file over iSCSI, but you would lose the dynamic growing and shrinking.
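
A layout like the targetcli listing above could be created along these lines on the main PC. This is only a sketch: the device, size and backstore name match the listing, but the IQN is a hypothetical placeholder, num_devices=2 is an assumption so that /dev/zram1 exists, and portal and ACL setup are left out. It prints the commands by default (DRY_RUN=1); set DRY_RUN=0 and run it as root to apply them.

```shell
#!/bin/sh
# Sketch: create a zram device and export it as an iSCSI block backstore.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command in dry-run mode, execute it otherwise.
    if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

export_zram() {
    # $1 = zram device, $2 = size, $3 = backstore name, $4 = target IQN
    run modprobe zram num_devices=2
    run zramctl --size "$2" "$1"
    run targetcli /backstores/block create name="$3" dev="$1"
    run targetcli /iscsi create "$4"
    run targetcli "/iscsi/$4/tpg1/luns" create "/backstores/block/$3"
}

# /dev/zram1, 10G and "zram" match the listing above; the IQN is a placeholder.
export_zram /dev/zram1 10G zram iqn.2019-12.local.pc:zram
```

The client then partitions and formats the exported disk like any other block device.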

crocket
Guru

Joined: 29 Apr 2017
Posts: 493

Posted: Mon Dec 16, 2019 11:45 am

How long does it take to compile chromium and libreoffice and gcc on raspberry pi 4?

costel78
Guru

Joined: 20 Apr 2007
Posts: 373

Posted: Tue Dec 17, 2019 12:57 pm

LibreOffice is acceptable at 1:15h, but chromium has just become a nightmare: without jumbo-build it takes 4:35h. Of course, that is using cross-compile. They just dropped jumbo-build; it was ~2h before. As a matter of fact, on the main PC it went from 1:30h to 3:30h.

crocket
Guru

Joined: 29 Apr 2017
Posts: 493

Posted: Tue Dec 17, 2019 1:38 pm

costel78 wrote:
LibreOffice is acceptable at 1:15h, but chromium has just become a nightmare: without jumbo-build it takes 4:35h. Of course, that is using cross-compile. They just dropped jumbo-build; it was ~2h before. As a matter of fact, on the main PC it went from 1:30h to 3:30h.


If you utilize cross-build, you don't really build on raspberry pi 4.

costel78
Guru

Joined: 20 Apr 2007
Posts: 373

Posted: Tue Dec 17, 2019 2:14 pm

Of course not. It is a matter of time and efficiency. And for big packages 4GB of RAM is not enough, and using swap is even slower.
Anyway, the configure and link steps still eat time on the Pi.
To convince me to compile only on the Pi, I would need a SATA/M.2 port or, at least, an enclosure with TRIM support. I have tried five already; three of them support TRIM on a PC but not on the Pi. I gave up and am using a UASP Axagon M.2 SATA enclosure.

crocket
Guru

Joined: 29 Apr 2017
Posts: 493

Posted: Tue Dec 17, 2019 9:43 pm

Is it possible to cross-compile sys-devel/gcc? It takes more than 12 hours to compile gcc through qemu-aarch64 on my desktop computer.

How can I prevent some packages from being compiled on qemu-aarch64 or raspberry pi?

costel78
Guru

Joined: 20 Apr 2007
Posts: 373

Posted: Wed Dec 18, 2019 5:52 pm

Well, I remember asking those questions :) not long ago. Fortunately, Gentoo has a wonderful community, including NeddySeagoon - Thank you!

What I learned:
Qemu is the slowest path you can take. Emulating CPU instructions is very expensive: under qemu, my 6700K CPU is slower than the Pi itself.
You may cross-compile gcc, either stage 1 only or entirely (well, as a single stage). As you probably know, compilers for most (non-interpreted) programming languages are bootstrapped, which means at least two stages, usually three.
The first stage can be cross-compiled; the next ones cannot, because they use the freshly built local version of the compiler.
The gcc build system allows --disable-bootstrap: https://gcc.gnu.org/install/configure.html.
This means the build consists of only one stage, which can be cross-compiled, but at a high cost in the resulting gcc (its own compile speed and the speed of the binaries it generates), up to 25% in my experience.

Compiling gcc on the Pi itself, with iSCSI zram for Portage space and 4GB RAM on the Pi 4, takes ~5h with bootstrapping and PGO.

crocket
Guru

Joined: 29 Apr 2017
Posts: 493

Posted: Thu Dec 19, 2019 12:04 am

Raspberry Pi 3 B+ is not going to be faster than qemu-aarch64 chroot with AMD FX-8300 CPU. Plus, local RAM is a lot faster than zram.

costel78
Guru

Joined: 20 Apr 2007
Posts: 373

Posted: Thu Dec 19, 2019 8:22 am

Yes, the Pi 3 is slower. I played with it, but started to use the Pi 4 for work.
For the majority of packages I use local zram, of course; iSCSI is only for those that require more than 3-4GB of space to compile.
Overall, don't trust qemu: it sometimes generates wrong binaries.

Later edit: I am sorry, I tried to answer from my phone, but phpBB2 is showing its age.
So, if you are still using a Pi 3, what stops you from compiling on a Pi 4 for it? It would be faster; just set the march flag.
CPU-wise, the main difference is that the Pi 4 uses an out-of-order CPU, which is why it is a lot faster.
Sakaki's Gentoo is optimized for the 4, but works on the 3 as well.

crocket
Guru

Joined: 29 Apr 2017
Posts: 493

Posted: Thu Dec 19, 2019 8:54 am

Today, I fried my Raspberry Pi 3 B+ by measuring AC voltage difference between protective earth and GND GPIO pin on Raspberry Pi 3 B+.
I also measured AC voltage difference between protective earth and nearby GPIO pins.
My Raspberry Pi 3 B+ became a brick.

Perhaps, God is telling me to stop using single board computers.

costel78
Guru

Joined: 20 Apr 2007
Posts: 373

Posted: Thu Dec 19, 2019 11:00 am

I am sorry to hear that. Fortunately, it is easily replaceable, with a better model.
Single board computers are not easy to use and require a lot of patience. I use them for niche tasks: a TV showing info, a warehouse stock client app, etc. What I am trying to say is that we have been drawn in by the flexibility of Gentoo, but also stuck with its long and tedious compiling, so it is vital to set our expectations when we are using those entry-level ARM boards.