Gentoo Forums
Chroot broken on arm64
NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 47055
Location: 56N 3W

Posted: Thu Sep 28, 2017 10:12 am    Post subject: Chroot broken on arm64

Team,

I got my arm64 install into a state where it either won't boot, because it hangs when it should be making static dev nodes, or it won't start Mate or Xfce4 due to a missing symbol.
So I've tried to chroot in from an older install as follows.

Code:
Pi3 64bit ~ # mount /dev/sdb1 /mnt/sdroot
Pi3 64bit ~ # mount --rbind /run /mnt/sdroot/run
Pi3 64bit ~ # mount --rbind /sys /mnt/sdroot/sys
Pi3 64bit ~ # mount --rbind /dev /mnt/sdroot/dev
Pi3 64bit ~ # mount -t proc proc  /mnt/sdroot/proc
Pi3 64bit ~ # chroot /mnt/sdroot /bin/bash

"If you'll excuse me a minute, I'm going to have a cup of coffee."
- broadcast from Apollo 11's LEM, "Eagle", to Johnson Space Center, Houston
  July 20, 1969, 7:27 P.M.

Pi3 64bit / # env-update

[1]+  Stopped                 chroot /mnt/sdroot /bin/bash
Pi3 64bit ~ #
That all looks good. /etc/bash/bashrc runs fortune as the very last thing.
Then, inside the chroot, running env-update stops the chroot process and drops me back to the outer shell.

It's like there is no job control somewhere.
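
Before chasing job control, it may be worth confirming that the bind mounts from the session above are really in place. The following is a minimal sketch, not part of the original post: `missing_mounts` is a hypothetical helper that reads `/proc/mounts`-style text on stdin and prints every expected mount point below the given root that is absent. The list of subdirectories (run, sys, dev, proc) is taken from the mount commands shown above.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): report which of the expected
# chroot mount points are missing. Reads /proc/mounts-style text on stdin.
missing_mounts() {
    root=$1
    # Field 2 of each /proc/mounts line is the mount point.
    mounts=$(awk '{ print $2 }')
    for sub in run sys dev proc; do
        printf '%s\n' "$mounts" | grep -qxF "$root/$sub" \
            || printf '%s\n' "$root/$sub"
    done
}

# Example: check the chroot used in this thread against the live mount table.
if [ -r /proc/mounts ]; then
    missing_mounts /mnt/sdroot </proc/mounts
fi
```

No output means all four mount points are present. This only checks that the mounts exist, so it cannot explain the stopped job by itself.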

Code:
chroot /mnt/sdroot /bin/busybox sh
works, as busybox is built statically, but lots of things don't work with busybox as the shell.

Any pointers?
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
LIsLinuxIsSogood
Veteran
Joined: 13 Feb 2016
Posts: 1175

Posted: Fri Sep 29, 2017 9:46 am

Maybe boot from sysrescueCD (assuming that you haven't already tried that?)
NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 47055
Location: 56N 3W

Posted: Fri Sep 29, 2017 10:06 am

LIsLinuxIsSogood,

It's arm64, not amd64. It will be a year or two before System Rescue CD supports that arch.
The system is a Raspberry Pi 3 in 64-bit mode. It's all a bit like the wild west :)

I've tried booting from an image of the broken install as it was 9 months ago. That boots but won't chroot.
nokilli
Apprentice
Joined: 25 Feb 2004
Posts: 196

Posted: Fri Sep 29, 2017 5:50 pm

You've helped so many people here, including myself, so there's a natural desire to return the favor, but all I've got is: maybe it's a permissions problem? Something you no doubt dismissed in the first few seconds of considering the problem.

Usually with chroots, that's where things go awry in my experience.
_________________
Today is the first day of the rest of your Gentoo installation.
NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 47055
Location: 56N 3W

Posted: Fri Sep 29, 2017 8:16 pm

nokilli,

All help gratefully received. I think it's probably something silly that I'm doing.

This may not be related.
The system I'm trying to chroot into won't boot on its own. It stalls when it should be making static device nodes for the kernel.
If I downgrade glibc while portage isn't looking (that's a silly thing to do - don't do that at home), it boots.
I've not tested the chroot with an older glibc.

Downgrading glibc prevents Xfce4 and/or Mate from working, so that's not an option but it does have the advantage that it boots :)

Rule One is assume nothing.
Feel free to ask anything you consider relevant.
pjp
Administrator
Joined: 16 Apr 2002
Posts: 18718

Posted: Sat Sep 30, 2017 3:45 am

OK, so a couple of questions mainly for my own benefit, but who knows, they may trigger an idea.

Keep in mind that I have little more than an ethereal concept that glibc provides "core libraries."

Quote:
It stalls when it should be making static device nodes for the kernel.
Is there a significant "transition" going on at the point where it should be creating the nodes? And more specifically, what code is it using at that point when it crashes? A new or previously rarely encountered bug in either code base might be the issue.

Since downgrading glibc results in solving the boot problem, I'm wondering if a different kernel would also solve the problem. If so, then perhaps finding a kernel that doesn't have the problem could narrow down the code change which led to the crash. Or maybe the same if the difference is in glibc code. If creating device nodes uses a smallish section of code, that would help narrow down the focus of the code search.

At least it sounds good from where I'm sitting (aka, no clue) :)
_________________
Magna Carta (1215) | Spectral evidence no longer permissible (c. 1792) | Cancel culture, deplatforming (c. 2016)
NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 47055
Location: 56N 3W

Posted: Sat Sep 30, 2017 9:16 am

pjp,

My first thought was: how can I determine that?
After a little pondering, I can fill the init scripts with echo commands. That should tell me what was executed last.

I did try the Interactive boot mode but I have to say "No" to services so early that it's no better than init=/bin/bash.
init=/bin/bash works as well as you would expect it to.
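
The echo approach described above can be sketched as a small helper, so the markers are easy to spot and easy to remove again. This is an illustration only: `trace` and the `TRACE_OUT` variable are hypothetical names, not part of OpenRC or of the original post.

```shell
#!/bin/sh
# Hypothetical tracing helper (not part of OpenRC): print a marker line so
# the last marker seen identifies the last step that ran.
# TRACE_OUT is an assumed variable; when unset, markers go to stderr.
trace() {
    if [ -n "${TRACE_OUT:-}" ]; then
        printf 'TRACE: %s\n' "$*" >>"$TRACE_OUT"
    else
        printf 'TRACE: %s\n' "$*" >&2
    fi
}

# Example use around a step in a script being debugged:
trace "before step"
# ... the original commands of the script would go here ...
trace "after step"
```

Early in boot the root filesystem may still be read-only, so pointing `TRACE_OUT` at a regular file may not work there; writing to the console is the safer assumption.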

-- edit --

After poking about with lots of echo statements, it appears that
/etc/inittab:
# System initialization, mount local filesystems, etc.
si::sysinit:/sbin/openrc sysinit

# Further system initialization, brings up the boot runlevel.
rc::bootwait:/sbin/openrc boot
...
l3:3:wait:/sbin/openrc default
...


The bootwait and wait statements both wait forever. Everything in the sysinit runlevel appears to run.
Nothing in the boot runlevel ever starts.

-- edit some more --

The devfs service completes but is in the
Code:
ps -Alf
output 10 seconds later, so it appears that the parent never finds out.
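
If the lingering devfs entry is a process that has exited but has not been reaped by its parent, it would show up with a state starting with `Z` (zombie). That is an assumption, not something the post confirms. A minimal sketch to filter `ps` output for such entries, where `list_zombies` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: keep the header line and any row whose third
# column (process state) starts with Z, i.e. exited but not yet reaped.
# Expects input in the column order pid, ppid, stat, comm.
list_zombies() {
    awk 'NR == 1 || $3 ~ /^Z/'
}

# Show zombies together with their parent PID.
ps -eo pid,ppid,stat,comm | list_zombies
```

The PPID column then points at the process that should have collected the exit status.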
pjp
Administrator
Joined: 16 Apr 2002
Posts: 18718

Posted: Sun Oct 01, 2017 2:21 am

Well, my perhaps overly unrealistic thought was doing something similar in either the kernel or glibc code. But I may have misremembered whether or not you did development. My thinking was that when the "create_device_node()" function was run, something in that was messed up.

NeddySeagoon wrote:
The bootwait and wait statements both wait forever. Everything in the sysinit runlevel appears to run.
Nothing in the boot runlevel ever starts.
So is it openrc that's creating the device nodes? I literally have no idea how or where that happens. Is the init system creating the devices which are then visible to the kernel?

NeddySeagoon wrote:
The devfs service completes but is in the
Code:
ps -Alf

output 10 seconds later, so it appears that the parent never finds out.
It is completing but not creating the devices?

And while going back to try ensuring I wasn't asking something completely unhelpful, I noticed your opening statement:
NeddySeagoon wrote:
I got my arm64 install into a state where it either won't boot
Do you recall what you did to get it in this state? This made it sound like it was just fine until you got an idea to experiment :)

The only other thing I could think of is whether or not the combination of the kernel, glibc and openrc is working on a different architecture (or the relevant pieces if they aren't all relevant).
vaxbrat
l33t
Joined: 05 Oct 2005
Posts: 731
Location: DC Burbs

Posted: Sun Oct 01, 2017 2:40 am    Post subject: Missing crucial node(s) in the static /dev tree maybe?

Could you be missing something in the static /dev tree that's there in the initial filesystem? That would either be the initramfs or the root one itself if you don't bother to use one and then pivot.
NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 47055
Location: 56N 3W

Posted: Sun Oct 01, 2017 9:25 am

pjp, vaxbrat,

Responses intermingled.

I got into this mess by doing an update. The update included glibc. After that build, I noticed sandbox errors about being unable to bind some signals.
From memory, signals 3, 10 and 15. I can do the glibc downgrade, boot and glibc upgrade if exact numbers are important.
It may even be in some of the build logs.

Code:
>>> sys-libs/glibc-2.25-r4 merged.
>>> Regenerating /etc/ld.so.cache...
sandbox:main  unable to bind signal 3: Bad file descriptor

sandbox:main  signal 15 already had a handler ...
sandbox:main  unable to bind signal 10: Bad file descriptor

The entire build log is at http://bpaste.net/show/96eed28e20b0

Seeing that glibc had just been built and there were apparent errors, I rebooted, which didn't work.
I tried the emergency manual override to downgrade glibc.
Don't do this at home!:
tar --xattrs -xpvf  packages/sys-libs/glibc-2.24-r3.tbz2 -C  /mnt/floppy

Which allowed the system to boot, for some versions of glibc.

I've reverted openrc the same way
Code:
tar --xattrs -xpf openrc-0.22.4.tbz2 -C /mnt/floppy/
thinking that booting might be an openrc issue.

I don't think I'm missing static /dev nodes. devtmpfs is mounted on /dev so everything should be there.
The only user is root, so it should not be a permissions issue in /dev either.

-- edit --

Bug sys-libs/glibc-2.25-r5 fails with binutils-2.29: segfault running simple test during pkg_preinst on arm64
might be relevant.
pjp
Administrator
Joined: 16 Apr 2002
Posts: 18718

Posted: Mon Oct 02, 2017 4:32 am

Unfortunately it was well beyond me at the beginning. What seems strange is that xfce stopped working with the downgrade. And I'm guessing the downgrade was to versions on which xfce4 previously worked?

Hopefully the bug is promising. Have you tried the suggestion of disabling stack protection?
NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 47055
Location: 56N 3W

Posted: Mon Oct 02, 2017 9:14 am

Team,

It's actually several problems. I'm unravelling it a bit at a time.

Downgrading glibc to 2.24-r3 allows booting to work.

The updated broken glibc-2.25.x provides a new symbol that libbsd-0.8.6 depends on.
Downgrading glibc to 2.24-r3 and libbsd to 0.8.5 allows both Xfce4 and Mate to work.

Strictly speaking the libbsd-0.8.6.ebuild should have an RDEPEND on >=glibc-2.25
but we don't tend to do that as glibc is in the system set and just works.
It's only ~arch users and beyond that will get caught out, and then only when they downgrade glibc,
which is, according to the error message, "a sure way to break your system".
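
A quick way to see which glibc symbol versions a library asks for is to look at the version references in its dynamic section. The sketch below is illustrative only: `max_glibc_version` is a hypothetical helper that reads `objdump -p` (or `readelf -V`) text on stdin and prints the highest `GLIBC_x.y` version mentioned. The library path in the example is an assumption and will differ between systems.

```shell
#!/bin/sh
# Hypothetical helper: print the highest GLIBC_x.y[.z] version referenced
# in `objdump -p` or `readelf -V` output read from stdin.
max_glibc_version() {
    grep -o 'GLIBC_[0-9][0-9.]*' \
        | sed 's/^GLIBC_//' \
        | sort -u -t. -k1,1n -k2,2n -k3,3n \
        | tail -n 1
}

# Example; the path is an assumption, adjust it for the system at hand.
LIB=${LIB:-/usr/lib64/libbsd.so.0}
if command -v objdump >/dev/null 2>&1 && [ -e "$LIB" ]; then
    objdump -p "$LIB" | max_glibc_version
fi
```

If the printed version is newer than the installed glibc, the dynamic loader will refuse to load the library, which would match the missing-symbol failure described above.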

I've built glibc with stack protection disabled, as per the bug, but it didn't install due
to file collisions. I'll install the tarball and see what happens.
mDup
Apprentice
Joined: 14 Apr 2006
Posts: 202

Posted: Tue Oct 03, 2017 2:29 pm    Post subject: Re: Chroot broken on arm64

NeddySeagoon wrote:

Code:
chroot /mnt/sdroot /bin/busybox sh
works as busybox is built statically but lots of things don't work with busybox as the shell.
Any pointers?

If I run
Code:
/bin/busybox ash

or
Code:
/bin/busybox sh

on a working system everything seems to work.
So which things don't work?

Note that
Code:
/sbin/ldconfig
and
Code:
/sbin/sln
are usually statically linked too.

Maybe it helps to run ldconfig?
Or to fix some links manually with sln?
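
Whether a given binary is statically linked can be checked from its program headers: a dynamically linked executable carries an INTERP entry naming the dynamic loader. The sketch below is a heuristic under that assumption, and `is_static` is a hypothetical helper name. It reads `readelf -l` output on stdin.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the `readelf -l` output on stdin has no
# INTERP program header, i.e. no dynamic loader is requested.
# This is a heuristic, not a complete test of how a binary was linked.
is_static() {
    if grep -q 'INTERP'; then
        return 1
    fi
    return 0
}

# Check the two binaries mentioned above, if they and readelf are present.
for bin in /sbin/ldconfig /sbin/sln; do
    if command -v readelf >/dev/null 2>&1 && [ -e "$bin" ]; then
        if readelf -l "$bin" 2>/dev/null | is_static; then
            echo "$bin: static"
        else
            echo "$bin: dynamic"
        fi
    fi
done
```

Binaries reported as static should keep working even when the chroot's glibc or loader is broken, which is why busybox works there.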

Sorry to not be of any help.
Good luck!
All times are GMT