btmnt -w
cd /stand
cp unix unix.good
cd
btmnt -d
We're going to start with an actual case. A local consultant called
me because he had tried to increase a kernel variable, but the link
failed. The increase was critical to the proper functioning of the
system, and he couldn't fix it.
As it turns out, I could have identified the problem in seconds. Unfortunately,
I didn't realize that at the time (live and learn), but even if I
had thought of that method, I would have dismissed it because I was
sure the problem was elsewhere. I'll tell you what I should have done
that would have instantly told me what was wrong, but I'll hold off
explaining why until later. Here's what would have given me the answer
I needed:
cd /etc/conf/cf.d
diff sdevice sdevice.new
Think about that as you read along.
This article doesn't go into the whole subject of drivers and the
link directories very deeply. You might want to read Understanding
Device Drivers if you want to understand more.
The first thing I did was this:
cd /etc/conf/cf.d
script /tmp/linkerr
./link_unix
After the script finished belching out its errors, I used CTRL-D to
exit "script", and went to look at /tmp/linkerr. Here it is:
# ./link_unix
The UNIX Operating System will now be rebuilt.
This will take a few minutes. Please wait.
Root for this system build is /
undefined first referenced
symbol in file
putctl /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
sdistributed /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
freemsg /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
qreply /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
flushq /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
putq /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
qsize /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
getq /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
putbq /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
allocb /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
linkb /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
copyb /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/log/Driver.o
dupb /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/log/Driver.o
freeb /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/log/Driver.o
canput /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/log/Driver.o
putnext /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/log/Driver.o
putctl1 /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ptm/Driver.o
qenable /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ptm/Driver.o
bufcall /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ldterm/Driver.o
pullupmsg /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/timod/Driver.o
copymsg /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/timod/Driver.o
msgdsize /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/tirdwr/Driver.o
unlinkb /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/tirdwr/Driver.o
rmvq /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/tirdwr/Driver.o
insq /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/tirdwr/Driver.o
lock_stp /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
backq /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
unlock_stp /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
qdetach /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
at_qrunflag /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
strwaitbuf /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
dupmsg /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
lock_str_bfsleep /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
strmaxblk /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
getclass /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
allocq /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
streams /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
freeq /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
setq /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
shlock_str_qnext /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
clnopen /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
noenable /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
qdisable /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
strdoioctl /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ksl/Driver.o
strwaitq /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ksl/Driver.o
findmod /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ksl/Driver.o
qattach /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ksl/Driver.o
strqset /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ip/Driver.o
adjmsg /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ip/Driver.o
strmsgsz /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/rip/Driver.o
unbufcall /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/iknt/Driver.o
bsize /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/iknt/Driver.o
esballoc /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/net0/Driver.o
mblock /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ipl/Driver.o
emblock /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ipl/Driver.o
rbsize /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/nfs/Driver.o
i386ld fatal: Symbol referencing errors. No output written to unix
ERROR: Can not link-edit unix
idbuild: idmkunix had errors.
System build failed.
#
Pretty awful mess, isn't it? I was convinced that a driver file in
/etc/conf/pack.d must be missing or horribly corrupted. Actually, though,
it couldn't have been a missing driver file: link_unix would have
reported that in plain English. A really badly corrupted driver file
would also have barfed differently, though the error message wouldn't
be as obvious (I'll show examples of that later). Could it be that
a good driver had been copied to the wrong place, for example somehow
copying /etc/conf/pack.d/clone/Driver.o to /etc/conf/pack.d/kbd? No,
because that would give us multiply defined symbols, and there's no
mention of that in the output.
How about a Driver.o from a different release, or from a backup prior
to the application of patches? Yes, that could cause these kinds of
errors, and that was my first thought. Yet I know the local consultant
pretty well, and that doesn't sound like something he would have done,
even accidentally, so I gave up on that idea and decided that some needed
driver was just not being linked into the kernel. Now to find it.
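With a transcript as long as /tmp/linkerr, it can help to boil it down
before picking a symbol to chase. The following is only a sketch: the
function names are made up for illustration, and it assumes every
undefined-symbol line has exactly two fields, the symbol and a path
ending in Driver.o, as in the listing above. It also strips the
carriage returns that "script" often leaves at the end of each line.

```shell
# undefined_symbols LOGFILE   (hypothetical helper, not part of the OS)
# Print the symbol name from every "symbol path/Driver.o" line of a saved
# link_unix transcript. Header and error lines are skipped because they
# do not consist of exactly two fields ending in /Driver.o.
undefined_symbols() {
    tr -d '\r' < "$1" |
        awk 'NF == 2 && $2 ~ /\/Driver\.o$/ { print $1 }'
}

# drivers_with_errors LOGFILE   (hypothetical helper)
# Print "count driver" for each pack.d directory that was the first to
# reference at least one undefined symbol, sorted by driver name.
drivers_with_errors() {
    tr -d '\r' < "$1" |
        awk 'NF == 2 && $2 ~ /\/Driver\.o$/ {
                 n = split($2, part, "/")   # part[n] is "Driver.o"
                 count[part[n - 1]]++       # part[n-1] is the driver name
             }
             END { for (d in count) print count[d], d }' |
        sort -k2
}

# Example use against the transcript saved earlier:
#   undefined_symbols /tmp/linkerr
#   drivers_with_errors /tmp/linkerr
```

Keep in mind that the linker names only the first file to reference
each symbol, so the per-driver counts show where the complaints landed,
not which driver is missing.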
I picked a symbol from the list of errors and went looking for it
like this:
cd /etc/conf/pack.d
for i in */Driver.o
do
strings $i | grep esballoc && echo $i
done
Let me say right away: that's NOT the best way to look for symbols
in a .o file, but I got lucky and "str" popped up as a match. I checked
/etc/conf/sdevice.d/str, and it was marked N:
str N 0 0 0 0 0 0 0 0
Now that's pretty odd; it shouldn't have been. "str" is the Streams
driver and is necessary for just about everything on the network.
I changed it to "Y" and tried the link again:
# ./link_unix
The UNIX Operating System will now be rebuilt.
This will take a few minutes. Please wait.
Root for this system build is /
undefined first referenced
symbol in file
clnopen /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
i386ld fatal: Symbol referencing errors. No output written to unix
ERROR: Can not link-edit unix
idbuild: idmkunix had errors.
System build failed.
That's better; far fewer errors, but still no success. When you are
linking a kernel, even one error is one too many. So I tried my script
again, but with clnopen this time:
cd /etc/conf/pack.d
for i in */Driver.o
do
strings $i | grep clnopen && echo $i
done
This didn't work, though. It's not that "clnopen" isn't somewhere
in one of those Driver.o files, it's that "strings" isn't good enough
to find it. However, I had other weapons: I was dialed in to the customer,
but was working from my own machine which happens to be the same OS
release. On my machine, I have the Development System installed, and
the Development System has "nm". So on my system I did this:
cd /etc/conf/pack.d
for i in */Driver.o
do
nm $i | grep clnopen && echo $i
done
Bingo! The "clone" driver has "clnopen", and sure enough, it too was
turned off in /etc/conf/sdevice.d (nobody knows how or why this happened,
by the way). I turned it back on, and now the kernel linked successfully.
If I had not had "nm", I could have done this:
cd /etc/conf/pack.d
for i in */Driver.o
do
hd $i | grep clnopen && echo $i
done
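One caveat with the "hd" approach: a hex dump normally shows only 16
bytes per line, so a name that happens to straddle two lines of the
dump would be missed. Searching the raw file avoids that. The sketch
below assumes a grep that copes with binary input (true of most current
versions; I have not verified it for the grep on this release), and the
function name is invented for illustration.

```shell
# find_symbol SYMBOL [PACKDIR]   (hypothetical helper)
# Print every Driver.o under PACKDIR (default /etc/conf/pack.d) whose raw
# bytes contain SYMBOL. This is a plain byte search, not a symbol-table
# lookup, so it reports drivers that reference the symbol as well as the
# one that defines it.
find_symbol() {
    sym=$1
    dir=${2:-/etc/conf/pack.d}
    for f in "$dir"/*/Driver.o
    do
        [ -f "$f" ] || continue
        # Discard grep's own output; only its exit status matters here.
        if grep "$sym" < "$f" > /dev/null 2>&1
        then
            echo "$f"
        fi
    done
}

# Example:
#   find_symbol clnopen
```

Expect more than one hit for a symbol like this, since every driver
that calls it carries the name too; the driver whose name resembles the
symbol is usually the one to check first.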
As I said at the outset, if I had done a diff on the two sdevice files,
this would have shown me:
60c60
< clone Y 1 0 0 0 0 0 0 0
---
> clone N 1 0 0 0 0 0 0 0
319c319
< str Y 0 0 0 0 0 0 0 0
---
> str N 0 0 0 0 0 0 0 0
The reason that works is that link_unix apparently doesn't replace
sdevice until the link is successful (sdevice is built from the individual
files in /etc/conf/sdevice.d). That's very helpful for this kind of
error, because it immediately shows you what has changed since the
last successful link.
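If the two files differ in many places, the diff can be narrowed to
just the Y/N flags. This is a rough sketch under a few assumptions: the
flag is the second field, lines beginning with "#" or "*" are comments,
and a driver's lines appear in the same order in both files. The
function name is made up for this example.

```shell
# sdevice_flag_changes GOOD NEW   (hypothetical helper)
# Print "driver old -> new" for every line whose second field (Y or N)
# differs between GOOD (the last successful link) and NEW.
sdevice_flag_changes() {
    awk '
        FNR == 1 { fileno++ }               # count which file we are in
        /^[#*]/ || NF < 2 { next }          # skip comments and short lines
        # A driver may have several lines, so key on name plus occurrence.
        { key = $1 " " (++seen[fileno, $1]) }
        fileno == 1 { flag[key] = $2; next }
        (key in flag) && flag[key] != $2 { print $1, flag[key], "->", $2 }
    ' "$1" "$2"
}

# Example, run from /etc/conf/cf.d:
#   sdevice_flag_changes sdevice sdevice.new
```

For the case described here, that would have reported only the clone
and str lines, each going from Y to N.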
Other Linking Errors
Of course, there are other things that can go wrong. One I see now
and then is where a new device has been partially installed or partially
removed, and the kernel fails to link because enough of the driver is
still there to confuse the build. In a case like this, you want to look
in /etc/conf/cf.d/mdevice, and the offending device will probably be at
the end of it. If you are not really sure, you can just comment out the
line you think is the problem by putting a "#" at the beginning of the
line; if the kernel then relinks, that was it. For example, here's the
end of my mdevice; the e3H was the last thing I added to this machine:
vdsp ocriI ioc vdsp 0 126 0 0 -1
vgic ociI ioc vgic 0 127 1 1 -1
vkbd ocwiI ioc vkbd 0 128 0 0 -1
vmouse ociI ioc vmse 0 129 1 1 -1
vw I icS vw 0 130 8 128 -1
net0 I iSc net0 0 131 1 256 -1
e3E I icSH e3e 0 132 0 1 -1
ipl Iocir ico ipl 0 133 1 1 -1
net1 - iSc net1 0 134 1 256 -1
e3H I icSH e3H 0 135 0 1 -1
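Commenting out a suspect line is easy to do by hand, but if you want to
keep the original untouched while you experiment, something like the
following would do it. Treat it as a sketch: the function name is
invented, and it assumes the driver name is the first field on the
line and that your awk accepts -v.

```shell
# comment_out_driver NAME [MDEVICE]   (hypothetical helper)
# Write MDEVICE (default /etc/conf/cf.d/mdevice) to standard output with
# "#" put in front of every line whose first field is exactly NAME.
comment_out_driver() {
    awk -v name="$1" '
        $1 == name { print "#" $0; next }   # disable the suspect entry
        { print }                           # pass everything else through
    ' "${2:-/etc/conf/cf.d/mdevice}"
}

# Example, run from /etc/conf/cf.d (keeps a copy to restore from):
#   cp mdevice mdevice.orig
#   comment_out_driver e3H mdevice.orig > mdevice
#   ./link_unix
```

If the relink still fails, copy mdevice.orig back and try the next
candidate.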
Corruption
What about a corrupted driver? The errors you get will depend upon
the nature of the corruption, but let's try some experiments (if you
aren't comfortable and sure of yourself, don't try this on a working
machine):
cd /etc/conf/pack.d/str
mv Driver.o Safe
date > Driver.o
cd /etc/conf/cf.d/
./link_unix
When I did this, I got a message saying that the file "Wed" (it happened
to be Wednesday) couldn't be opened for input. Let's try something
else:
cd /etc/conf/pack.d/str
cp /bin/ls Driver.o
cd /etc/conf/cf.d/
./link_unix
This time I got a message complaining that it couldn't open "file
ELF". That would be a very definite sign of corruption: Driver files
would always be "COFF".
To put everything back as it was:
cd /etc/conf/pack.d/str
rm Driver.o
mv Safe Driver.o
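Both experiments boil down to a Driver.o that doesn't start the way an
object file should, and that can be checked without running a link at
all. The sketch below looks at the first four bytes of a file. The
function name is hypothetical, and the COFF test assumes the usual i386
COFF magic number (0x014c, stored low byte first); check that against a
known good driver before trusting it.

```shell
# object_format FILE   (hypothetical helper)
# Print ELF, COFF, or unknown, judging only by the first bytes of FILE.
object_format() {
    # Read four bytes and turn them into a string of hex digits.
    magic=$(dd if="$1" bs=4 count=1 2>/dev/null |
            od -An -tx1 | tr -d ' \t\n')
    case $magic in
        7f454c46) echo ELF ;;       # 0x7f followed by "ELF"
        4c01*)    echo COFF ;;      # assumed i386 COFF magic, 0x014c
        *)        echo unknown ;;
    esac
}

# Example: report any driver that does not look like COFF.
#   for i in /etc/conf/pack.d/*/Driver.o
#   do
#       [ "$(object_format "$i")" = COFF ] || echo "suspect: $i"
#   done
```

The file made with "date" would come back as unknown, and the copy of
/bin/ls as ELF.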
I hope this gives you a little more confidence should you ever run
into a broken kernel relink. Certainly other errors are possible,
but these are the most common I've seen.
Originally appeared at http://www.aplawrence.com (http://www.aplawrence.com/Unixart/linkfail.html)
About the Author:
A.P. Lawrence provides SCO Unix and Linux consulting services http://www.pcunix.com
Read this newsletter at: http://www.linuxpronews.com/2003/0917.html