Lost Input Channel From IDENT:1000@localhost

The Problem

[Linux]

28th August 2003

Several weeks ago, I was trying to get PHP-Nuke working, especially the mailout for new user confirmations. While fiddling to make that work, I did something that screwed up my mail setup.

I wasn't sure what I had done. Yeah, I know, I should have written down everything in my notebook so I could backtrack. But I have lots of excuses why I didn't do that this time. Yeah, the one time I don't make notes is the one time I screw it up.

This is my email setup:

  • Every 5 minutes, fetchmail checks the mail server and brings down any mail waiting for me.
  • It hands the mail to port 25 on localhost, which happens to be sendmail.
  • Sendmail dumps it in my mail queue.
  • I fire up mutt, mutt reads the mail queue, and I read my email.

I noticed the problem when I hadn't seen any mail for an hour. I checked fetchmail manually. It seemed to hang. I used ps and there were half a dozen fetchmails hanging there. I killed them all off, and ran fetchmail manually with extra logging: "fetchmail -v -v". That showed where the problem was.

fetchmail would successfully contact the mail server, get the list of emails waiting for me, then bring the first one down. It opened a socket connection to port 25 on localhost and went through the usual SMTP dialogue to deliver the email to sendmail. The problem occurred while it was sending the SMTP envelope. When the recipient command ("RCPT TO:<hgriggs@hgriggs.com>") was sent, fetchmail just waited and waited and eventually timed out, gave errors about handing off to the MTA, left the message on the mail server and shut down.
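The envelope exchange described above can be replayed without fetchmail in the loop. What follows is only a minimal sketch, assuming Python 3 and its standard smtplib module; the function names and the test addresses are made up for illustration and are not part of the original setup. It steps through EHLO, MAIL FROM and RCPT TO and records the reply code for each, so a stall at the recipient step shows up as a missing code instead of a hung process.

```python
import smtplib
import socket


def probe_envelope(smtp, sender, recipient):
    """Walk the SMTP envelope and record the reply code of each step.

    `smtp` is any object offering smtplib.SMTP's ehlo/mail/rcpt methods.
    Returns a list of (step, code) pairs. A step that times out or loses
    the connection is recorded with code None and ends the probe.
    """
    steps = [
        ("EHLO", lambda: smtp.ehlo()),
        ("MAIL FROM", lambda: smtp.mail(sender)),
        ("RCPT TO", lambda: smtp.rcpt(recipient)),
    ]
    results = []
    for name, call in steps:
        try:
            code, _message = call()
        except (socket.timeout, smtplib.SMTPServerDisconnected):
            results.append((name, None))
            break
        results.append((name, code))
    return results


def probe_local(sender, recipient, host="localhost", port=25, timeout=30):
    """Run the probe against a local MTA (hypothetical helper).

    The timeout makes a stalled server raise instead of hanging forever.
    """
    with smtplib.SMTP(host, port, timeout=timeout) as server:
        return probe_envelope(server, sender, recipient)
```

Calling something like probe_local("test@localhost", "hgriggs@localhost") on the affected box would be expected to show a code for EHLO and MAIL FROM and None for RCPT TO, if the stall really is at the recipient step.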

Searching for a Solution

I checked sendmail and did some test outbound messages and they worked fine. Did a manual connection on port 25 and got the same result. Looked in /var/log/maillog and saw this line:

Aug 11 19:28:45 henry sm-mta[267]: h7BNIhGb000267: lost input channel from IDENT:1000@localhost [127.0.0.1] to MTA after rcpt
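The most useful part of that line is the tail: "after rcpt" names the last SMTP command sendmail had seen before the connection went away. As a rough sketch, assuming Python 3 and a log format like the single sample above (the pattern and the function name are my own guesses, not anything sendmail documents), the pieces can be pulled apart like this when there are many such lines to sift through:

```python
import re

# Pattern modelled on the one sample line above; other sendmail versions
# or syslog formats may need adjustments.
LOST_CHANNEL = re.compile(
    r"(?P<program>[\w-]+)\[(?P<pid>\d+)\]: (?P<queue_id>\w+): "
    r"lost input channel from (?P<peer>.+?) to (?P<daemon>\S+) "
    r"after (?P<stage>\w+)"
)


def parse_lost_channel(line):
    """Return the fields of a 'lost input channel' log line, or None."""
    match = LOST_CHANNEL.search(line)
    return match.groupdict() if match else None
```

Grouping the results by the "stage" field would show whether every failure happens at the same point in the SMTP dialogue.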

So I did heaps of Google searches looking for an answer. I found heaps of people asking the same question, and the only answer anyone offered was that this happens when sendmail does an ident call to check the validity of the sender and the from address is faked, usually by spammers. That's not me.

I tried a number of different things and got nowhere. In desperation I switched to using the Mozilla email. That worked, it let me keep going, but it's not the most satisfactory solution for my needs. I do a lot of remote work and I ssh into my box and out of my box and I'm all over the place. Using mutt allows me to do email quickly and easily, even over slow links and from boxes without X. Mozilla helped me out, but it made it awkward for me. I could still use mutt and switch to the mail files that Mozilla uses (yay for open standards and common mailbox formats) but I couldn't do the sort of searches and sorting and filtering that I was used to.

Anyway, as I had email back, sort of, I didn't work hard to fix the problem. Every now and then I would come back and experiment some more. I checked that identd was working, and that inetd was properly spawning it when requests were made. I could see heaps of valid ident results in /var/log/messages when the web server did ident calls. I checked the sendmail.cf and submit.cf configuration files. I recreated them. I checked the new versions. I even changed the ident timeout in submit.cf so that sendmail would not do an ident lookup. It still failed with the same "lost input channel from IDENT" error. Everything else checked out. I checked my /etc/mail/aliases; I checked resolv.conf and host.conf and hosts. I followed some suggestions and made sure that hosts had both localhost and localhost.localdomain. Everything checked out. I had two other Slackware boxes with identical installations to compare against. They worked, this one didn't.

I even decided to try the Windows approach. I used Slackware's package management tools and removed sendmail completely, and reinstalled it. Same result. Downloaded the latest sendmail sources, compiled them and installed the result. Same result. I installed an earlier version of sendmail, 8.11.7, which doesn't split the work between smmsp and sm-mta. Same result. Blew it away and reinstalled the latest sendmail from scratch. Same result.

Then I tried the real Windows approach. I backed up, formatted my system and reinstalled from scratch. Same result. What was going on? This was insane.

Eventually, I got so angry I sat down and checked every single thing, every single step in the path. I checked the logs of fetchmail, and everything it did. I watched sendmail as it did its thing. I checked the sendmail source for what caused the "lost input channel from IDENT" error. There are three or four causes for the error. My problem appeared to come after identd returned a valid result: sendmail went on to do something else, got an error, and it all got reported as if identd was the problem. So I went through the sendmail installation notes, and checked everything to do with my sendmail installation, and I found the problem. I don't know how it happened, and I don't know how it survived a complete reinstall.

The Solution

My mail spool file had incorrect ownership. /var/spool/mail/hgriggs had group users and not group mail. Changing the group back to mail fixed the problem completely. So after sendmail did the ident and got back a correct answer, it went to open my mail spool file, got an open error because of the wrong group ownership, and reported back as if it was still an IDENT problem. But that doesn't explain the long timeout. Does it retry many times? Was it really identd trying to do things with the mail spool file? I'm not sure which one now, but I know how to fix the problem.
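A check like this is cheap enough to run before tearing anything else apart. Here is a minimal sketch, assuming Python 3 on a Unix system; the function names are invented for illustration, and the default of "mail" as the expected group comes from this particular Slackware setup, not from any universal rule.

```python
import grp
import os


def spool_group(path):
    """Return the name of the group that owns `path`."""
    gid = os.stat(path).st_gid
    try:
        return grp.getgrgid(gid).gr_name
    except KeyError:
        # The gid has no entry in the group database; report the number.
        return str(gid)


def check_spool_group(path, expected="mail"):
    """Return (ok, actual_group) for the file at `path`."""
    actual = spool_group(path)
    return actual == expected, actual
```

On the broken box, check_spool_group("/var/spool/mail/hgriggs") would have come back as (False, "users"). The fix itself is a single chgrp mail on the spool file, run as root.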

How did it get the wrong group ownership? I don't know. More importantly, how did it get the wrong group ownership after a full reinstall of Slackware? Again, I don't know. But it did. And that's what caused me all the grief. Setting correct group ownership solved the entire problem immediately.

A long time ago, I read in Æleen Frisch's book Essential System Administration that 99% of Unix problems are permission problems. Which includes ownership problems. Maybe I mis-remember the percentage value, but over the years that saying has proved itself again and again. Permissions and ownership prove to be the source of many of my problems. Maybe one day I'll stop causing them. But every time I find one problem and fix it, I gain a lot more knowledge. Tracking this error down gave me a lot of knowledge that I am grateful for. It makes it easier to track down other problems.