Post by Valts Mazurs
On Mon, 15 Jan 2007 11:54:46 +0200
Post by Peter Nixon
Post by Valts Mazurs
And what if authorization requests had higher priority than the accounting
requests?
This is possible in FreeRADIUS. I run separate sql modules for
accounting and auth with different DB usernames. You can then assign
different numbers of sockets to them, fail them over differently (or even
use sql_log for accounting if you wish), assign different priorities
to the queries on the database side (if your DB supports such options),
or run them against different physical machines. (Auth queries can run on a
read-only slave mirror of the master DB, for example.)
What I mean is that I would like to process all authorization requests
before accounting requests. That is, if there are any auth requests
in the queue, a worker thread takes the youngest auth request. The logic is
that it is more reasonable to answer the youngest auth request
and deliver the answer in time. Older auth requests may already be
worthless, so there is no point in processing them.
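The newest-first policy described above can be sketched like this (a minimal
illustration, not Valts's actual implementation; AUTH_TIMEOUT, the class name
and the request shapes are all assumed names):

```python
from collections import deque

AUTH_TIMEOUT = 1.0  # assumed NAS retransmit window, in seconds


class RequestScheduler:
    """Serve the newest auth request first; expire stale ones.

    Accounting requests wait in plain FIFO order until auth is drained.
    """

    def __init__(self):
        self.auth = deque()  # used as a LIFO stack for auth requests
        self.acct = deque()  # plain FIFO for accounting

    def submit(self, request, kind, now):
        if kind == "auth":
            self.auth.append((now, request))
        else:
            self.acct.append((now, request))

    def next_request(self, now):
        # Newest auth first; silently drop auth requests that are older
        # than the NAS timeout, since the NAS has given up on them.
        while self.auth:
            arrived, req = self.auth.pop()
            if now - arrived <= AUTH_TIMEOUT:
                return req
        if self.acct:
            return self.acct.popleft()[1]
        return None
```

The interesting property is the silent drop: a stale auth request is never
answered, on the theory that any answer would arrive after the NAS has
already retransmitted.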
I suspect that this gives similar results to the reply caching in FreeRADIUS
("cleanup_delay" etc). What happens to "older" requests that are still in
the queue? Do you expire them after some time, reply to them eventually, or
keep track of duplicates?
Post by Valts Mazurs
Accounting requests may wait in the queue until they are processed.
Post by Peter Nixon
Post by Valts Mazurs
And what if the accounting response could be sent even before processing
the request?
This is possible in FreeRADIUS.
1) incoming acct request
2) send acct response
3) pass request for modules to process
4) see if processing was ok
5) log the request to special file if processing failed
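The five steps above can be sketched as follows (illustrative Python, not
FreeRADIUS code; send_response, process and the failure-log path are all
hypothetical callables and names):

```python
import json


def handle_acct_request(request, send_response, process, failure_log_path):
    """Early-ack accounting: acknowledge first, then process.

    Returns True if processing succeeded, False if the request had to
    be written to the failure log instead.
    """
    # 1) incoming acct request -> 2) send the acct response immediately
    send_response(request)
    try:
        # 3) pass the request to the processing modules
        process(request)
        # 4) processing was ok
        return True
    except Exception:
        # 5) processing failed: log the request to a special file
        with open(failure_log_path, "a") as f:
            f.write(json.dumps(request) + "\n")
        return False
```

Note that the response goes out before processing either way; that is exactly
the trade-off Peter criticizes below, since a crash between steps 2 and 5 can
lose an already-acknowledged record.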
If it is possible in FreeRADIUS I will consider it as my fault that I
have not found this functionality by myself.
ok. Basically the way you handle this with FreeRADIUS is:
* you use the sql (or ldap) module to directly service Authentication
requests. All Authentication requests are handled as fast as the backend can
service them.
* you pass all accounting requests to either the sqllog module OR a specially
configured detail module. (I prefer the detail method, and I am going to
describe that, but the concept and result is effectively the same.)
* The detail module writes the accounting request to an on-disk spool (on a
big system this should be a dedicated disk so that speed is not affected by
other system utilization) and sends an accounting response.
* A second copy of FreeRADIUS configured in "radrelay" mode lazily reads this
spool from disk and processes it into your accounting database. (As I
mentioned in a previous email, your accounting database may be a Master
while your Auth database is a read only slave if you wish to split up the
load between multiple machines)
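The spool-then-relay idea above can be sketched like this (assuming a simple
line-per-record format; the real detail module uses its own on-disk format
and locking, so this is only the shape of the design):

```python
import os


def spool_accounting(path, record):
    """Append one accounting record to the on-disk spool and fsync it
    BEFORE the accounting response is sent, so that an acknowledged
    record survives a server crash."""
    with open(path, "a") as f:
        f.write(record.rstrip("\n") + "\n")
        f.flush()
        os.fsync(f.fileno())


def drain_spool(path, process):
    """Lazily replay spooled records into the backend, the way a
    radrelay-mode reader would. Returns the number of records replayed."""
    replayed = 0
    with open(path) as f:
        for line in f:
            process(line.rstrip("\n"))
            replayed += 1
    return replayed
```

The key ordering difference from the early-ack scheme is that the fsync
happens before the acknowledgement, so the response never promises something
the disk does not already hold.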
This is superior to your method in that it is not possible to lose an
accounting record after having already sent an accounting response. If your
radius server crashes or disappears, your NASes will think that the
accounting packet has been logged and forget about it; mine will resend it to
my secondary RADIUS server and the packet will not be lost.
While there IS a difference in speed between writing to an on-disk spool and
your method (an in-memory queue), our method is "correct", and a dedicated
disk (or raid set) is more than fast enough to keep up with thousands of
requests per second. (I haven't benchmarked it recently, but I suspect we are
in 100K-requests-per-second territory here, depending on disk spindle speed,
filesystem and cache configuration.)
This could possibly be improved by adding an in-memory authentication queue
to make sure that radiusd never sends more concurrent queries to the database
backend than it has sql sockets. However, this only buys you a small
advantage (given that we are already queuing acct): if a request is delayed
for more than 1 second (this is typically configurable on the NAS, of course)
the NAS will resend the Auth request to the secondary RADIUS server. This
results in duplicated load on the backend (assuming it's shared, although it
could be a second slave database), which makes the problem even worse.
Basically, a queue of more than a second (or the timeout configured on your
NAS) is worse than sending an Authentication reject to a couple of users, as
the whole thing just snowballs! An Auth queue only helps in the case where
you have a huge peak of requests that cannot be serviced simultaneously but
CAN be serviced quicker than the configurable timeout of your NAS. If you
continually have a deep queue then you need to increase the speed of your
backend.
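The queue-versus-timeout argument above boils down to a simple drain-time
check (the numbers in the usage below are illustrative, not measurements):

```python
def queue_is_worthwhile(burst_backlog, backend_rate, nas_timeout):
    """An auth queue only helps if the queued burst drains before the
    NAS retransmit timeout; past that point every queued request is
    retransmitted and the backend load doubles instead of shrinking."""
    drain_time = burst_backlog / backend_rate
    return drain_time <= nas_timeout
```

For example, a burst of 800 requests against a backend doing 1000 auths per
second drains in 0.8 s, inside a 1 s NAS timeout, so queuing wins; a backlog
of 5000 cannot drain in time, and rejecting early is cheaper than the
retransmit snowball.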
Post by Valts Mazurs
Post by Peter Nixon
Post by Valts Mazurs
In VoIP world 50 requests per second is nothing BIG. Users begin
and terminate their sessions very often. The main reason is bad
link quality which results in many unsuccessful call attempts.
Yes. You are correct. I have happily put 1000 requests per second
through FreeRADIUS. It all depends on a properly set-up backend.
I'm not saying that FreeRADIUS is slow. It really is not.
It's simply that there are ways to help slow backends appear not so
slow to the end user.
Yes. There are _some_ tricks you can play to deal with restart conditions
(when one or more NASes reset a group of interfaces due to an error
condition, or reboot completely) more effectively. In the end, however, if
your Authentication backend cannot clear the load backlog within a second or
so, you are lost anyway. Accounting, of course, can be handled lazily, as I
described above.
I would be interested to see you run a benchmark to show that your algorithm
of dealing with the newest Auth request first is actually a performance gain
in a high-load environment. (I suspect that it will make very little
difference compared with our caching system.) If it does make a considerable
difference, then of course we would consider adding a similar feature to
FreeRADIUS. (ie. Please prove to us that what you have done is actually
better!)
Regarding your accounting optimization, it would be trivial for us to do the
same thing, but it would be wrong, as it introduces the possibility of lost
packets. (While there ARE companies out there that generate enough accounting
packets to saturate a single disk, they can afford to purchase a raid set,
and they DO care about doing everything possible to not lose data!)
PGP Key: http://www.peternixon.net/public.asc