0015001 /* this file contains the interface of the network software with the file
0015002 * system.
0015003 *
0015004 * Copyright 1995 Philip Homburg
0015005 *
messages from the FS
In order to communicate with one another, the services (the file system (FS), the memory manager (MM), and the network service), the kernel, and the user processes send messages back and forth. These messages take the following form:
typedef struct {
int m_source; /* who sent the message */
int m_type; /* what kind of message is it */
union {
mess_1 m_m1;
mess_2 m_m2;
mess_3 m_m3;
mess_4 m_m4;
mess_5 m_m5;
mess_6 m_m6;
} m_u;
} message;
typedef struct {int m1i1, m1i2, m1i3; char *m1p1, *m1p2, *m1p3;} mess_1;
typedef struct {int m2i1, m2i2, m2i3; long m2l1, m2l2; char *m2p1;} mess_2;
typedef struct {int m3i1, m3i2; char *m3p1; char m3ca1[M3_STRING];} mess_3;
typedef struct {long m4l1, m4l2, m4l3, m4l4, m4l5;} mess_4;
typedef struct {char m5c1, m5c2; int m5i1, m5i2; long m5l1, m5l2, m5l3;}mess_5;
typedef struct {int m6i1, m6i2, m6i3; long m6l1; sighandler_t m6f1;} mess_6;
(In the names of the fields above, "i" stands for integer, "l" stands for long, and "p" stands for pointer.)
The message types (i.e., m_type) that the FS sends will be DEV_OPEN, DEV_CLOSE, DEV_READ, DEV_WRITE, DEV_IOCTL3, and DEV_CANCEL. For example, if a user process opens the /dev/udp file, the FS sends a message of type "DEV_OPEN" to the network service. The type of message that the network service always sends the FS is DEVICE_REPLY.
Recall that only a single field within a union is valid. When the file system (FS) receives a request (open, read, write, etc.) for one of the device files associated with the network service (/dev/udp, /dev/tcp, /dev/ip, etc.), the FS sends a message with an m_u of mess_2 to the network service. Repeated from above, the mess_2 typedef is declared as follows:
typedef struct {int m2i1, m2i2, m2i3; long m2l1, m2l2; char *m2p1;} mess_2;
m_source | m_type | DEVICE (m2_i1) | PROC_NR (m2_i2) | COUNT or REQUEST (m2_i3) | POSITION (m2_l1) | TTY_FLAGS (m2_l2) | ADDRESS (m2_p1)
Note: In the original figure, POSITION and TTY_FLAGS are marked gray. POSITION is not used here, and TTY_FLAGS is always zero (false), which denotes blocking mode (in other words, the process is waiting for a response).
The meaning of each of these fields is as follows:
DEVICE is the minor number of the device.
PROC_NR is the process number of the client.
COUNT is the number of bytes to transfer.
REQUEST is specific to ioctl operations. An example of an ioctl operation is NWIOSUDPOPT (NetWork IO Set UDP OPTions).
POSITION is the position in the device file.
TTY_FLAGS is zero (blocking mode).
ADDRESS is a pointer to a buffer. This buffer will either be read from or written to by the network service.
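The field list above can be made concrete with a minimal sketch of how a read request might be filled in. The layout is a cut-down copy of the mess_2 typedef quoted above; the alias macros follow the m2_i1 naming used in the text, while ex_build_read() and the value of EX_DEV_READ are hypothetical and are not taken from the MINIX headers.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down copy of the mess_2 layout quoted in the text. */
typedef struct {int m2i1, m2i2, m2i3; long m2l1, m2l2; char *m2p1;} mess_2;

typedef struct {
    int m_source;               /* who sent the message */
    int m_type;                 /* what kind of message is it */
    union { mess_2 m_m2; } m_u; /* only the mess_2 member is shown here */
} message;

/* Aliases in the style the text uses (m2_i1, m2_i2, ...). */
#define m2_i1     m_u.m_m2.m2i1
#define m2_i2     m_u.m_m2.m2i2
#define m2_i3     m_u.m_m2.m2i3
#define m2_l1     m_u.m_m2.m2l1
#define m2_l2     m_u.m_m2.m2l2
#define m2_p1     m_u.m_m2.m2p1

#define DEVICE    m2_i1
#define PROC_NR   m2_i2
#define COUNT     m2_i3
#define POSITION  m2_l1
#define TTY_FLAGS m2_l2
#define ADDRESS   m2_p1

/* Placeholder only: the real DEV_READ constant lives in the MINIX headers. */
#define EX_DEV_READ 3

/* Hypothetical helper: fill in a blocking read request for one client. */
void ex_build_read(message *m, int fs_proc, int minor, int client,
    char *buf, int nbytes)
{
    assert(m != NULL);
    m->m_source = fs_proc;
    m->m_type = EX_DEV_READ;
    m->DEVICE = minor;      /* minor number of the device file */
    m->PROC_NR = client;    /* process whose buffer is used */
    m->COUNT = nbytes;      /* number of bytes to transfer */
    m->POSITION = 0;        /* not used by the network service */
    m->TTY_FLAGS = 0;       /* zero: blocking mode */
    m->ADDRESS = buf;       /* buffer in the client's address space */
}
```

Because COUNT and REQUEST share m2_i3, an ioctl request would store its command in the same slot that a read or write uses for the byte count.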
It is interesting to compare the fields above with the parameters for dev_io (from the file system):
PUBLIC int dev_io(op, nonblock, dev, pos, bytes, proc, buff)
int op; /* DEV_READ, DEV_WRITE, DEV_IOCTL, etc. */
int nonblock; /* TRUE if nonblocking op */
dev_t dev; /* major-minor device number */
off_t pos; /* byte position */
int bytes; /* how many bytes to transfer */
int proc; /* in whose address space is buff? */
char *buff; /* virtual address of the buffer */
The format of reply messages is different:
m_source | DEVICE_REPLY | proc nr | status | fd | operation | unused | unused
The message queue is a linked list of mq_t struct's:
typedef struct mq
{
message mq_mess;
struct mq *mq_next;
int mq_allocated;
} mq_t;
Immediately after initialization, the message queue is as follows:
In preparation for receiving a message, mq_get() is called to remove one of the mq_t structs in the linked list from the free list of messages. This link will hold the next message received and will be passed to either the code that handles the messages from the file system, the asynchronous alarm task, or the ethernet task.
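As a rough illustration of the free list described above, the sketch below keeps a small fixed pool of queue entries and hands them out one at a time. It is not the code in mq.c: the ex_ names, the pool size, and the integer payload standing in for the message field are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define EX_MQ_SIZE 4            /* hypothetical pool size */

typedef struct ex_mq {
    int mq_payload;             /* stands in for the message mq_mess field */
    struct ex_mq *mq_next;
    int mq_allocated;
} ex_mq_t;

static ex_mq_t ex_mq_list[EX_MQ_SIZE];
static ex_mq_t *ex_mq_freelist;

/* Thread every pool entry onto the free list. */
void ex_mq_init(void)
{
    int i;

    ex_mq_freelist = NULL;
    for (i = 0; i < EX_MQ_SIZE; i++) {
        ex_mq_list[i].mq_allocated = 0;
        ex_mq_list[i].mq_next = ex_mq_freelist;
        ex_mq_freelist = &ex_mq_list[i];
    }
}

/* Take one entry off the free list, or return NULL if none is left. */
ex_mq_t *ex_mq_get(void)
{
    ex_mq_t *mq = ex_mq_freelist;

    if (mq == NULL)
        return NULL;
    ex_mq_freelist = mq->mq_next;
    mq->mq_next = NULL;
    mq->mq_allocated = 1;
    return mq;
}

/* Put an entry back on the free list once its message has been handled. */
void ex_mq_free(ex_mq_t *mq)
{
    assert(mq != NULL && mq->mq_allocated);
    mq->mq_allocated = 0;
    mq->mq_next = ex_mq_freelist;
    ex_mq_freelist = mq;
}
```

A fixed pool avoids dynamic allocation inside the service; the cost is that a caller has to cope with ex_mq_get() returning NULL when every entry is in use.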
0015006 * The valid messages and their parameters are:
0015007 *
0015008 * Requests:
0015009 *
0015010 * m_type NDEV_MINOR NDEV_PROC NDEV_REF NDEV_MODE
0015011 * -------------------------------------------------------------
0015012 * | DEV_OPEN |minor dev | proc nr | fd | mode |
0015013 * |-------------+-----------+-----------+-----------+----------+
0015014 * | DEV_CLOSE |minor dev | proc nr | fd | |
0015015 * |-------------+-----------+-----------+-----------+----------+
0015016 *
0015017 * m_type NDEV_MINOR NDEV_PROC NDEV_REF NDEV_COUNT NDEV_BUFFER
0015018 * ---------------------------------------------------------------------------
0015019 * | DEV_READ |minor dev | proc nr | fd | count | buf ptr |
0015020 * |-------------+-----------+-----------+-----------+-----------+-----------|
0015021 * | DEV_WRITE |minor dev | proc nr | fd | count | buf ptr |
0015022 * |-------------+-----------+-----------+-----------+-----------+-----------|
0015023 *
0015024 * m_type NDEV_MINOR NDEV_PROC NDEV_REF NDEV_IOCTL NDEV_BUFFER
0015025 * ---------------------------------------------------------------------------
0015026 * | DEV_IOCTL3 |minor dev | proc nr | fd | command | buf ptr |
0015027 * |-------------+-----------+-----------+-----------+-----------+-----------|
0015028 *
0015029 * m_type NDEV_MINOR NDEV_PROC NDEV_REF NDEV_OPERATION
0015030 * -------------------------------------------------------------------|
0015031 * | DEV_CANCEL |minor dev | proc nr | fd | which operation|
0015032 * |-------------+-----------+-----------+-----------+----------------|
0015033 *
0015034 * Replies:
0015035 *
0015036 * m_type REP_PROC_NR REP_STATUS REP_REF REP_OPERATION
0015037 * ----------------------------------------------------------------------|
0015038 * | DEVICE_REPLY | proc nr | status | fd | which operation |
0015039 * |--------------+-------------+------------+---------+-----------------|
0015040 */
0015041
0015042 #include "inet.h"
0015043
0015044 #include <minix/callnr.h>
0015045
0015046 #include "mq.h"
0015047 #include "proto.h"
0015048 #include "generic/type.h"
0015049
0015050 #include "generic/assert.h"
0015051 #include "generic/buf.h"
0015052 #include "generic/sr.h"
0015053
0015054 THIS_FILE
0015055
0015056 #define FD_NR (16*IP_PORT_MAX)
IP_PORT_MAX is the maximum number of network devices (4 for the 386) and is defined in inet/inet_config.h, so FD_NR is 16*4 = 64. There can be two ethernet network devices and two psip network devices.
0015057
0015058 typedef struct sr_fd
0015059 {
0015060 int srf_flags;
0015061 int srf_fd;
0015062 int srf_port;
0015063 sr_open_t srf_open;
0015064 sr_close_t srf_close;
0015065 sr_write_t srf_write;
0015066 sr_read_t srf_read;
0015067 sr_ioctl_t srf_ioctl;
0015068 sr_cancel_t srf_cancel;
0015069 mq_t *srf_ioctl_q, *srf_ioctl_q_tail;
0015070 mq_t *srf_read_q, *srf_read_q_tail;
0015071 mq_t *srf_write_q, *srf_write_q_tail;
0015072 } sr_fd_t;
sr_fd / sr_fd_table[] / sr_add_minor()
One of the most important data arrays in the network service is sr_fd_table[], an array of 64 struct sr_fd's. Each sr_fd element in sr_fd_table[] corresponds to either a device or an opened file descriptor to a device (i.e., a "channel"):
typedef struct sr_fd
{
int srf_flags;
int srf_fd;
int srf_port;
sr_open_t srf_open;
sr_close_t srf_close;
sr_write_t srf_write;
sr_read_t srf_read;
sr_ioctl_t srf_ioctl;
sr_cancel_t srf_cancel;
mq_t *srf_ioctl_q, *srf_ioctl_q_tail;
mq_t *srf_read_q, *srf_read_q_tail;
mq_t *srf_write_q, *srf_write_q_tail;
} sr_fd_t;
For each device (e.g., /dev/udp0), an element in sr_fd_table[] is configured by sr_add_minor(). For example, for the following inet.conf file:
eth0 DP8390 0 { default; };
psip1;
an element (i.e., a struct sr_fd) is configured for each of the following devices:
/dev/eth0 sr_fd_table[1]
/dev/ip0 sr_fd_table[2]
/dev/tcp0 sr_fd_table[3]
/dev/udp0 sr_fd_table[4]
/dev/psip1 sr_fd_table[17]
/dev/ip1 sr_fd_table[18]
/dev/tcp1 sr_fd_table[19]
/dev/udp1 sr_fd_table[20]
sr_add_minor() is called in the initialization routines for the various protocols: mnx_eth.c (osdep_eth_init()), psip.c (psip_enable()), ip.c (ip_init()), tcp.c (tcp_init()), and udp.c (udp_init()).
When a device file (e.g., /dev/udp0) is opened by a process, the element that corresponds to the device is copied to an element that is currently unoccupied (see sr_open()). In this way, a "channel" is opened. Using this technique, a channel can be opened, closed, and manipulated without affecting the elements of the descriptors initially set by sr_add_minor().
int srf_flags:
srf_flags is a combination of the following:
#define SFF_FREE 0x00
#define SFF_MINOR 0x01
#define SFF_INUSE 0x02
#define SFF_BUSY 0x3C
#define SFF_IOCTL_IP 0x04
#define SFF_READ_IP 0x08
#define SFF_WRITE_IP 0x10
#define SFF_PENDING_REQ 0x30
#define SFF_SUSPENDED 0x1C0
#define SFF_IOCTL_SUSP 0x40
#define SFF_READ_SUSP 0x80
#define SFF_WRITE_SUSP 0x100
srf_flags is initialized to SFF_FREE for each element in sr_fd_table[]. If the channel corresponds to a device file, srf_flags is set to SFF_INUSE | SFF_MINOR. If the channel does not correspond to a device file, srf_flags is set simply to SFF_INUSE.
When a request comes in for a read, write, or ioctl operation and the network service is not already processing another request for the same operation, the matching flag (SFF_READ_IP, SFF_WRITE_IP, or SFF_IOCTL_IP) is set in srf_flags. However, if an operation is attempted while the underlying protocol is still processing a previous request of the same kind (e.g., udp_write()), the appropriate flag (SFF_IOCTL_SUSP, SFF_READ_SUSP, or SFF_WRITE_SUSP) in srf_flags is set.
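One way to read the paragraph above is as a small state machine on srf_flags. The sketch below reuses the flag values from the listing; enum ex_op and ex_start_request() are hypothetical names, and the real sr_rwio() does considerably more (queueing the message and calling into the protocol).

```c
#include <assert.h>
#include <stddef.h>

/* Flag values copied from the listing. */
#define SFF_FREE        0x00
#define SFF_MINOR       0x01
#define SFF_INUSE       0x02
#define SFF_IOCTL_IP    0x04
#define SFF_READ_IP     0x08
#define SFF_WRITE_IP    0x10
#define SFF_IOCTL_SUSP  0x40
#define SFF_READ_SUSP   0x80
#define SFF_WRITE_SUSP  0x100

enum ex_op { EX_IOCTL, EX_READ, EX_WRITE };   /* hypothetical */

static int ex_ip_flag(enum ex_op op)
{
    return op == EX_IOCTL ? SFF_IOCTL_IP :
        op == EX_READ ? SFF_READ_IP : SFF_WRITE_IP;
}

static int ex_susp_flag(enum ex_op op)
{
    return op == EX_IOCTL ? SFF_IOCTL_SUSP :
        op == EX_READ ? SFF_READ_SUSP : SFF_WRITE_SUSP;
}

/*
 * Returns 1 if the request can be handed to the protocol right away,
 * or 0 if a request of the same kind is already in progress and the
 * new one has to wait.
 */
int ex_start_request(int *flags, enum ex_op op)
{
    assert(flags != NULL);
    if (*flags & ex_ip_flag(op)) {
        *flags |= ex_susp_flag(op);   /* a request of this kind is waiting */
        return 0;
    }
    *flags |= ex_ip_flag(op);         /* mark the operation as in progress */
    return 1;
}
```

Keeping a separate in-progress and suspended bit per operation lets a read, a write, and an ioctl be outstanding on the same channel at once without interfering with each other.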
int srf_fd, srf_port:
srf_fd and srf_port are both set by sr_add_minor(). For the channels in sr_fd_table[] that correspond to the device files (e.g., /dev/udp0), srf_fd is set to the minor device number of the device. For example, if /dev/udp0 is added to sr_fd_table[] and the interface number of the device file is 0 (see comments for ip_conf[]), then the minor device number is:
if2minor(ifno, dev) = ((0)*16 + UDP_DEV) = 0 + 4 = 4
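The arithmetic above can be checked against the device table given earlier. In the sketch below, the body of if2minor() is written out from the worked example, and the EX_ device offsets are inferred from the sr_fd_table[] indices listed in the text rather than copied from the inet headers, so treat them as assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define EX_IP_PORT_MAX  4                       /* 4 for the 386, per the text */
#define EX_FD_NR        (16 * EX_IP_PORT_MAX)   /* size of sr_fd_table[] */

/* Offsets inferred from the table in the text (eth0 -> 1, ..., udp0 -> 4). */
#define EX_ETH_DEV   1
#define EX_PSIP_DEV  1
#define EX_IP_DEV    2
#define EX_TCP_DEV   3
#define EX_UDP_DEV   4

/* Sixteen minor numbers are reserved per interface. */
#define if2minor(ifno, dev)  ((ifno) * 16 + (dev))

/* Range-checked wrapper around the macro; hypothetical helper. */
int ex_minor(int ifno, int dev)
{
    assert(ifno >= 0 && ifno < EX_IP_PORT_MAX);
    assert(dev >= 0 && dev < 16);
    return if2minor(ifno, dev);
}
```

With these assumptions, interface 1 starts at minor 16, which is why /dev/psip1 lands in sr_fd_table[17] and /dev/udp1 in sr_fd_table[20].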
For the channels in sr_fd_table[] that do not correspond to a device file, srf_fd is the file descriptor for the appropriate protocol. For example, if the file system requests that a udp channel be opened, srf_open is dereferenced and udp_open() is called. udp_open() opens a udp file descriptor and returns the index of the corresponding element in udp_fd_table[]. srf_fd is set to the index of this element.
Later, when the file system requests a read or a write on the open channel, srf_fd is passed into the protocol-specific read or write function (e.g., udp_read()), allowing the protocol-specific function to locate the appropriate file descriptor (e.g., udp file descriptor).
srf_port is more straightforward. srf_port is the index in the protocol's port table. For example, if a system has two udp device files (/dev/udp0 and /dev/udp1), udp_port_table[] will have two entries, 0 and 1. Therefore, srf_port for the entry in sr_fd_table[] that corresponds to /dev/udp0 will be 0 and srf_port for the entry that corresponds to /dev/udp1 will be 1.
sr_open_t srf_open:
sr_close_t srf_close:
sr_write_t srf_write:
sr_read_t srf_read:
sr_ioctl_t srf_ioctl:
sr_cancel_t srf_cancel:
The fields above are all pointers to protocol-specific functions and are all set by sr_add_minor(). For example, when sr_add_minor() is called by udp_init(), srf_open, srf_close, srf_write, srf_read, srf_ioctl, and srf_cancel are set to point to the functions udp_open(), udp_close(), udp_write(), udp_read(), udp_ioctl(), and udp_cancel(). Later, when the file system makes a request to the network service, these functions will be called. For example, if the file system requests that data be written to a channel, srf_write is dereferenced and, if the channel is a udp channel, udp_write() is called.
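The dispatch through these pointers might look roughly like the sketch below. Only two of the six pointers are shown, and their signatures are simplified guesses; the real sr_open_t and sr_write_t types are declared in generic/sr.h and take more parameters. All ex_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for sr_open_t and sr_write_t. */
typedef int (*ex_open_t)(int port);
typedef int (*ex_write_t)(int fd, int count);

struct ex_sr_fd {
    int srf_fd;
    int srf_port;
    ex_open_t srf_open;
    ex_write_t srf_write;
};

/* Toy protocol functions standing in for udp_open() and udp_write(). */
int ex_udp_open(int port) { return 10 + port; }
int ex_udp_write(int fd, int count) { (void) fd; return count; }

/* Record the protocol's entry points, as sr_add_minor() is described to do. */
void ex_add_minor(struct ex_sr_fd *e, int minor, int port,
    ex_open_t openf, ex_write_t writef)
{
    assert(e != NULL && openf != NULL && writef != NULL);
    e->srf_fd = minor;
    e->srf_port = port;
    e->srf_open = openf;
    e->srf_write = writef;
}

/* A write request only needs the table entry, not the protocol's name. */
int ex_do_write(struct ex_sr_fd *e, int count)
{
    return (*e->srf_write)(e->srf_fd, count);
}
```

The benefit of this arrangement is that the request-handling code never has to name a protocol: adding another protocol means registering another set of pointers.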
mq_t *srf_ioctl_q, *srf_ioctl_q_tail:
mq_t *srf_read_q, *srf_read_q_tail:
mq_t *srf_write_q, *srf_write_q_tail:
The fields above are linked lists of ioctl, read, and write messages waiting to be processed. When a message requesting an ioctl, read, or write operation is received while an earlier message of the same type is still being processed, the message is placed at the end of the corresponding linked list; if no earlier message of that type is outstanding, it is handled right away.
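Each head/tail pair above behaves like an ordinary singly linked FIFO. The sketch below shows the append and remove steps in isolation; ex_req_t, ex_enqueue(), and ex_dequeue() are hypothetical names, and an integer id stands in for the queued message.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ex_req {
    int id;                     /* stands in for the queued message */
    struct ex_req *mq_next;
} ex_req_t;

/* Append m at the tail; the tail pointer makes this O(1). */
void ex_enqueue(ex_req_t **head, ex_req_t **tail, ex_req_t *m)
{
    assert(head != NULL && tail != NULL && m != NULL);
    m->mq_next = NULL;
    if (*head == NULL)
        *head = m;              /* first entry: head and tail coincide */
    else
        (*tail)->mq_next = m;
    *tail = m;
}

/* Remove and return the oldest entry, or NULL if the queue is empty. */
ex_req_t *ex_dequeue(ex_req_t **head, ex_req_t **tail)
{
    ex_req_t *m;

    assert(head != NULL && tail != NULL);
    m = *head;
    if (m == NULL)
        return NULL;
    *head = m->mq_next;
    if (*head == NULL)
        *tail = NULL;
    m->mq_next = NULL;
    return m;
}
```

Keeping an explicit tail pointer is what allows requests to be appended in arrival order without walking the list.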
After the initialization of the network service, sr_rec() is called upon receipt of messages from the file system in the endless loop within main(). sr_rec() then calls a function to handle the specific request. For open requests, sr_rec() calls sr_open(); for read, write, and ioctl requests, sr_rec() calls sr_rwio(); for close requests, sr_rec() calls sr_close(); for cancel requests, sr_rec() calls sr_cancel().
0015073
0015074 #define SFF_FLAGS 0x0F
0015075 # define SFF_FREE 0x00
0015076 # define SFF_MINOR 0x01
0015077 # define SFF_INUSE 0x02
0015078 # define SFF_BUSY 0x3C
0015079 # define SFF_IOCTL_IP 0x04
0015080 # define SFF_READ_IP 0x08
0015081 # define SFF_WRITE_IP 0x10
0015082 # define SFF_PENDING_REQ 0x30
0015083 # define SFF_SUSPENDED 0x1C0
0015084 # define SFF_IOCTL_SUSP 0x40
0015085 # define SFF_READ_SUSP 0x80
0015086 # define SFF_WRITE_SUSP 0x100
0015087
The "SFF" above stands for "Send/Receive File Descriptor Flags". These flags determine the state of the file descriptor.
0015088 FORWARD _PROTOTYPE ( int sr_open, (message *m) );
0015089 FORWARD _PROTOTYPE ( void sr_close, (message *m) );
0015090 FORWARD _PROTOTYPE ( int sr_rwio, (mq_t *m) );
0015091 FORWARD _PROTOTYPE ( int sr_cancel, (message *m) );
0015092 FORWARD _PROTOTYPE ( void sr_reply, (mq_t *m, int reply, int can_enqueue) );
0015093 FORWARD _PROTOTYPE ( sr_fd_t *sr_getchannel, (int minor));
0015094 FORWARD _PROTOTYPE ( acc_t *sr_get_userdata, (int fd, vir_bytes offset,
0015095 vir_bytes count, int for_ioctl) );
0015096 FORWARD _PROTOTYPE ( int sr_put_userdata, (int fd, vir_bytes offset,
0015097 acc_t *data, int for_ioctl) );
0015098 FORWARD _PROTOTYPE ( int sr_repl_queue, (int proc, int ref, int operation) );
0015099 FORWARD _PROTOTYPE ( int walk_queue, (sr_fd_t *sr_fd, mq_t *q_head,
0015100 mq_t **q_tail_ptr, int type, int proc_nr, int ref) );
0015101 FORWARD _PROTOTYPE ( void process_req_q, (mq_t *mq, mq_t *tail,
0015102 mq_t **tail_ptr) );
0015103 FORWARD _PROTOTYPE ( int cp_u2b, (int proc, char *src, acc_t **var_acc_ptr,
0015104 int size) );
0015105 FORWARD _PROTOTYPE ( int cp_b2u, (acc_t *acc_ptr, int proc, char *dest) );
0015106
0015107 PRIVATE sr_fd_t sr_fd_table[FD_NR];
0015108 PRIVATE mq_t *repl_queue, *repl_queue_tail;
repl_queue / repl_queue_tail
repl_queue is the head of the linked list (repl_queue_tail is the tail) of REVIVE messages that the network service previously tried to send to the file system (FS) but that could not be delivered due to deadlock. Before any new messages are processed by the network service, a second (and last) attempt is made by sr_repl_queue() to send out these REVIVE messages in repl_queue.
0015109 PRIVATE cpvec_t cpvec[CPVEC_NR];
0015110
0015111 PUBLIC void sr_init()
0015112 {
0015113 #if ZERO
0015114 int i;
0015115
0015116 for (i=0; i<FD_NR; i++)
FD_NR for the 80386 is 64 (see line 15056). It is the number of elements in sr_fd_table[].
A few of these elements are filled in later by calls to sr_add_minor() (which is called by a client's initialization function - e.g., ip_init()).
0015117 sr_fd_table[i].srf_flags= SFF_FREE;
0015118 repl_queue= NULL;
0015119 #endif
0015120 }
0015121
0015122 PUBLIC void sr_rec(m)
0015123 mq_t *m;
sr_rec()
The network service receives messages from the file system (FS), the ethernet task, the clock task, and the asynchronous alarm task. sr_rec() is called upon the receipt of a message from the file system in the endless loop within main().
If there are any messages in repl_queue, sr_rec() first calls sr_repl_queue() to send them. It then calls sr_open() (for open requests), sr_close() (for close requests), sr_rwio() (for read, write, and ioctl requests), or sr_cancel() (for cancel requests) to further process the message.
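The order of operations described above (drain queued replies first, then dispatch on the message type) can be summarised in a short sketch. The enum values are placeholders rather than the real DEV_* constants, the counter stands in for repl_queue, and ex_sr_rec() only reports which handler would be called instead of calling it.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder message types; the real values are in the MINIX headers. */
enum ex_msg {
    EX_DEV_OPEN, EX_DEV_CLOSE, EX_DEV_READ,
    EX_DEV_WRITE, EX_DEV_IOCTL3, EX_DEV_CANCEL
};

enum ex_handler { EX_NONE, EX_SR_OPEN, EX_SR_CLOSE, EX_SR_RWIO, EX_SR_CANCEL };

/*
 * queued_replies stands in for repl_queue: anything still queued is
 * flushed before the new request is looked at.
 */
enum ex_handler ex_sr_rec(enum ex_msg m_type, int *queued_replies)
{
    assert(queued_replies != NULL);
    if (*queued_replies > 0)
        *queued_replies = 0;    /* where sr_repl_queue() would run */

    switch (m_type) {
    case EX_DEV_OPEN:
        return EX_SR_OPEN;
    case EX_DEV_CLOSE:
        return EX_SR_CLOSE;
    case EX_DEV_READ:
    case EX_DEV_WRITE:
    case EX_DEV_IOCTL3:
        return EX_SR_RWIO;      /* one handler for all three */
    case EX_DEV_CANCEL:
        return EX_SR_CANCEL;
    }
    return EX_NONE;
}
```

Routing read, write, and ioctl through a single handler reflects the fact that the three differ mainly in which flag bits and which queue they use.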
Note that "rec" in "sr_rec" stands for "receive".
0015124 {
0015125 int result;
0015126 int send_reply, free_mess;
0015127
0015128 if (repl_queue)
Before any new messages can be processed, the messages in the queue of reply messages (repl_queue) must be sent by sr_repl_queue() (see lines 15132 and 15140). If the message is a cancellation message (NW_CANCEL), sr_repl_queue() sends the cancellation message after all the other reply messages have been sent.
All reply messages are sent to the file system.
0015129 {
0015130 if (m->mq_mess.m_type == NW_CANCEL)
0015131 {
0015132 result= sr_repl_queue(m->mq_mess.PROC_NR, 0, 0);
sr_repl_queue()
If repl_queue is not empty, sr_repl_queue(proc, ref, operation) is called to send all messages (which will be REVIVE messages) in repl_queue to the file system. If called with proc, sr_repl_queue()'s first parameter, set to ANY (#define'd in include/minix/com.h - this value is not within allowable range of process numbers), sr_repl_queue() goes through repl_queue and sends each reply message to the appropriate process. If called with proc set to a specific process number (which is the case when sr_rec() is processing an NW_CANCEL message), sr_repl_queue() goes through repl_queue and sends every REVIVE message except for any messages destined for the process specified by proc (if any exist). sr_repl_queue() finally sends the message destined to the process specified by proc after all other messages have been sent. While the file system is still interested in the REVIVE message, the priority goes to other replies.
In addition to sending all the reply messages queued in repl_queue, sr_repl_queue() frees the messages in repl_queue for later use and sets repl_queue to NULL.
A return value of 1 indicates that a message in repl_queue was canceled. A return value of 0 indicates that no messages were canceled.
The second and third parameters (ref and operation) are never used in sr_repl_queue().
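The ordering rule described above (everything else first, replies for the cancelling process last) can be sketched as a single pass over the list. This follows the description in the text, not the actual body of sr_repl_queue(): the ex_ names are hypothetical, EX_ANY is a stand-in for ANY, sending is simulated by recording process numbers, and freeing the queue entries is left out.

```c
#include <assert.h>
#include <stddef.h>

#define EX_ANY     0x7ace   /* stand-in for ANY: not a valid process number */
#define EX_MAXSENT 16

typedef struct ex_reply {
    int proc;                   /* destination of the queued REVIVE */
    struct ex_reply *next;
} ex_reply_t;

int ex_sent[EX_MAXSENT];        /* log of "sent" replies, in order */
int ex_nsent;

static void ex_send(int proc)
{
    if (ex_nsent < EX_MAXSENT)
        ex_sent[ex_nsent++] = proc;
}

/* Returns 1 if a reply for proc was found in the queue, else 0. */
int ex_repl_queue(ex_reply_t **queue, int proc)
{
    ex_reply_t *m, *next, *held = NULL, *held_tail = NULL;
    int found = 0;

    assert(queue != NULL);
    for (m = *queue; m != NULL; m = next) {
        next = m->next;
        m->next = NULL;
        if (proc != EX_ANY && m->proc == proc) {
            /* Hold replies for the cancelling process until the end. */
            if (held == NULL)
                held = m;
            else
                held_tail->next = m;
            held_tail = m;
            found = 1;
        } else {
            ex_send(m->proc);
        }
    }
    for (m = held; m != NULL; m = m->next)
        ex_send(m->proc);
    *queue = NULL;              /* the queue is empty afterwards */
    return found;
}
```

The return value is what lets the caller decide whether a cancel request arrived too late to matter.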
0015133 if (result)
A REVIVE message destined to the same process as the origin of the NW_CANCEL message just received was found in repl_queue. Since the NW_CANCEL message arrived too late, simply free the NW_CANCEL message and return.
0015134 {
0015135 mq_free(m);
mq_free()
mq_free(mq) places mq, mq_free()'s only parameter, back on mq_freelist.
0015136 return; /* canceled request in queue */
0015137 }
0015138 }
0015139 else
0015140 sr_repl_queue(ANY, 0, 0);
0015141 }
0015142
0015143 switch (m->mq_mess.m_type)
Hand the message off to the request-specific function.
0015144 {
0015145 case DEV_OPEN:
0015146 result= sr_open(&m->mq_mess);
sr_open()
sr_open() examines the minor number of the message (the DEVICE field, more specifically; this will reflect which file was opened - for example, if the DEVICE field is equal to 4, /dev/udp0 was opened) and copies the sr_fd_table[] element whose index corresponds to this minor number to the first unused entry of the table (thereby opening a channel). sr_open() sets the element's flag to SFF_INUSE and then calls the protocol specific open function (e.g., udp_open()) for the channel (a pointer to this function is found in the srf_open field of the element). This function's return value sets the element's file descriptor field (srf_fd). Finally sr_open() returns the index of the new element.
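The steps just listed could be sketched as follows. This is a simplified reading of the description, not the body of sr_open(): the table entry is reduced to four fields, the open function takes only a port, and the -1 error return is a placeholder for whatever error codes the real function uses.

```c
#include <assert.h>
#include <stddef.h>

#define EX_FD_NR   64
#define SFF_FREE   0x00
#define SFF_MINOR  0x01
#define SFF_INUSE  0x02

typedef int (*ex_open_t)(int port);     /* simplified stand-in for sr_open_t */

typedef struct {
    int srf_flags;
    int srf_fd;
    int srf_port;
    ex_open_t srf_open;
} ex_sr_fd_t;

ex_sr_fd_t ex_table[EX_FD_NR];          /* zero-initialised: all SFF_FREE */

/* Toy protocol open: returns a fixed protocol-level descriptor. */
int ex_udp_open(int port) { (void) port; return 3; }

/* Returns the index of the new channel, or -1 (placeholder error code). */
int ex_sr_open(int minor)
{
    int i, fd;

    if (minor < 0 || minor >= EX_FD_NR ||
        !(ex_table[minor].srf_flags & SFF_MINOR))
        return -1;                      /* no such device configured */

    for (i = 0; i < EX_FD_NR && (ex_table[i].srf_flags & SFF_INUSE); i++)
        ;                               /* find the first unused entry */
    if (i == EX_FD_NR)
        return -1;                      /* table is full */

    ex_table[i] = ex_table[minor];      /* copy the device's template */
    ex_table[i].srf_flags = SFF_INUSE;  /* a channel, not a device entry */
    assert(ex_table[i].srf_open != NULL);
    fd = (*ex_table[i].srf_open)(ex_table[i].srf_port);
    if (fd < 0) {
        ex_table[i].srf_flags = SFF_FREE;
        return fd;
    }
    ex_table[i].srf_fd = fd;            /* protocol-level descriptor */
    return i;
}
```

Copying the template means the entry set up for the device itself is never modified, so every later open of the same device starts from the same state.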
0015147 send_reply= 1;
0015148 free_mess= 1;
Read, write, and ioctl operations free their own messages. Open, close, and cancel operations do not free messages. Therefore, free_mess equals zero for read, write, and ioctl operations and free_mess equals one for open, close, and cancel operations.
As an example of a read/write/ioctl operation freeing messages, consider the following straightforward scenario: there are no messages in the read message queue of a file descriptor (i.e., srf_read_q and srf_read_q_tail are both null), but data has been received (perhaps by the ethernet task) and the file descriptor was opened by the udp client. If a message arrives requesting a read operation (i.e., m->mq_mess.m_type == DEV_READ), udp_read() is called, which in turn calls udp_packet2user(). udp_packet2user() calls sr_put_userdata() twice. The first time sr_put_userdata() is called, data (which was previously received) is transferred to the user process. The second time that sr_put_userdata() is called, a reply is sent to the user process by sr_reply(), which ultimately calls mq_free() to free the message.
Below is a more complex example of message releasing. In this example, no read messages are waiting in the read queue of a file descriptor opened by the udp client. However, this time, there is no data waiting to be read.
A DEV_READ message (message 1) arrives and sr_rec() calls sr_rwio() to handle the message. Since there are no messages in the read queue, SFF_READ_IP is not initially set. After setting the SFF_READ_IP flag, sr_rwio() calls udp_read(). However, since no data is waiting to be read, udp_read() returns NW_SUSPEND. sr_rwio() then sets the SFF_READ_SUSP flag and returns NW_SUSPEND to sr_rec(), which then calls sr_reply() to send a REVIVE message back to the file system. In other words, sr_reply() instructs the FS to unblock the process requesting the read.
A second DEV_READ message (message 2) arrives and sr_rec() again calls sr_rwio(). Because the SFF_READ_IP flag is already raised by the first message, the second message is appended to the read queue and sr_rwio() returns NW_SUSPEND. Again, sr_reply() sends a REVIVE message to the file system.
A third DEV_READ message (message 3) arrives and, as before, is placed at the end of the read queue and a REVIVE message is sent to the file system.
Finally, some data arrives (possibly from the ethernet task). udp_packet2user() is ultimately called and it, in turn, calls sr_put_userdata() twice: the first time to deliver the packet to the buffer of the user process and the second time to return the number of bytes delivered to the user process. The second time sr_put_userdata() is called, it calls sr_reply(), which sets the read queue head to null.
After sr_reply() returns, sr_put_userdata() then calls process_req_q(), passing in mq as the first parameter. process_req_q() goes through the remaining messages, starting with mq (see figure above), handling the read requests if possible. If there is enough data to satisfy all of the read requests (i.e., the requests of messages 2 and 3), the read queue is as appears in the figure on the left. If, for example, there is not even enough data to satisfy the request of message 2, the message queue is as appears in the figure on the right.
If there was enough data to satisfy the requests of messages 2 and 3, the remaining messages are freed by sr_reply(), just as message 1 was previously freed.
The same principle is also true for send_reply. If the operation itself sends a reply (by calling sr_reply()), there is no need for sr_rec() to send another. Open and close operations do not send replies themselves and, therefore, sr_rec() must always send a reply (see line 15179) for them; for a cancel operation, sr_rec() sends the reply only when the result is EINTR (see line 15167). Read, write, and ioctl send replies for themselves if the operation completes. However, if the read/write/ioctl operation does not complete (i.e., the operation is suspended), sr_rec() must send the reply.
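The way send_reply and free_mess are chosen in the switch statement can be condensed into one small function. This is only a sketch: op_kind, reply_policy, and decide_flags() are hypothetical names that do not appear in sr.c, and the suspended and interrupted arguments stand in for the tests result == SUSPEND and result == EINTR in the listing.

```c
#include <assert.h>

/* Hypothetical names for illustration; sr.c switches on the message type. */
enum op_kind { OP_OPEN, OP_CLOSE, OP_RWIO, OP_CANCEL };

struct reply_policy { int send_reply; int free_mess; };

/* suspended:   nonzero if a read/write/ioctl returned SUSPEND.
 * interrupted: nonzero if a cancel returned EINTR. */
static struct reply_policy decide_flags(enum op_kind op, int suspended,
                                        int interrupted)
{
    struct reply_policy f = { 0, 0 };

    switch (op) {
    case OP_OPEN:
    case OP_CLOSE:
        f.send_reply = 1;             /* the operation never replies itself */
        f.free_mess = 1;              /* nobody else will free the message */
        break;
    case OP_RWIO:
        f.send_reply = suspended ? 1 : 0; /* a completed operation already replied */
        f.free_mess = 0;              /* the message is freed on the reply path */
        break;
    case OP_CANCEL:
        f.send_reply = interrupted ? 1 : 0;
        f.free_mess = 1;
        break;
    default:
        assert(!"unknown operation"); /* plays the role of ip_panic() */
        break;
    }
    return f;
}
```

A suspended read, for example, yields send_reply == 1 and free_mess == 0: the caller replies, but the message stays allocated because it is still queued on the channel.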
0015149 break;
0015150 case DEV_CLOSE:
0015151 sr_close(&m->mq_mess);
sr_close()
sr_close() closes a channel by calling the channel's protocol-specific close function (e.g., udp_close()) and marking the channel's corresponding element in sr_fd_table[] as free.
0015152 result= OK;
0015153 send_reply= 1;
0015154 free_mess= 1;
0015155 break;
0015156 case DEV_READ:
0015157 case DEV_WRITE:
0015158 case DEV_IOCTL:
0015159 result= sr_rwio(m);
sr_rwio()
sr_rwio() is either called by sr_rec() upon receipt of a read, write, or ioctl request or by process_req_q() to handle queued messages. The handling of a read, write, or ioctl request is a complex operation that is best explained by an example. We will analyze two different read requests, one with data ready to be delivered to user processes and one with no data ready to be delivered.
SCENARIO 1: There is data ready to be delivered to the user process and there are no messages in the read queue (i.e., srf_read_q, srf_read_q_tail == null). The file descriptor was opened by the udp client.
When a message arrives requesting a read operation (i.e., m->mq_mess.m_type == DEV_READ), udp_read() is called, which in turn calls udp_packet2user(). udp_packet2user() calls sr_put_userdata() twice. The first time sr_put_userdata() is called, the data is transferred to the user process. The second time sr_put_userdata() is called, a reply is sent to the user process by sr_reply(), which ultimately calls mq_free(), freeing the message.
SCENARIO 2: There are no read messages waiting in the read queue of a file descriptor opened by the udp client. However, this time, there is no data waiting to be read.
A DEV_READ message (message 1) arrives and sr_rec() calls sr_rwio() to handle the message. Since there are no messages in the read queue, SFF_READ_IP is not raised. After setting the SFF_READ_IP flag, sr_rwio() calls udp_read(). However, since no data is waiting to be read, udp_read() returns NW_SUSPEND. sr_rwio() then sets the SFF_READ_SUSP flag and returns NW_SUSPEND to sr_rec(), which then sends a REVIVE message back to the file system (in other words, it instructs the FS to revive the process requesting the read). After this, the sr file descriptor is shown:
A second DEV_READ message (message 2) arrives and sr_rec() again calls sr_rwio(). Because the SFF_READ_IP flag is already raised by the first message, the second message is appended to the read queue and sr_rwio() returns NW_SUSPEND. Again, sr_reply() sends a REVIVE message to the file system.
A third DEV_READ message (message 3) arrives and, as before, is placed at the end of the read queue and a REVIVE message is sent to the file system.
Finally, some data arrives (possibly from the ethernet task). udp_packet2user() is ultimately called and it, in turn, calls sr_put_userdata() twice: the first time to deliver the packet to the buffer of the user process and the second time to return the number of bytes delivered to the user process. The second time sr_put_userdata() is called, it calls sr_reply(), which sets the read queue head to null.
After sr_reply() returns, sr_put_userdata() then calls process_req_q(), passing in mq as the first parameter. process_req_q() goes through the remaining messages, starting with mq (see figure above), handling the read requests (by calling sr_rwio()) if possible. If there is enough data to satisfy all of the read requests (i.e., the requests of messages 2 and 3), the read queue is as appears in the figure on the left. If, for example, there is not even enough data to satisfy the request of message 2, the message queue is as appears in the figure on the right.
If there was enough data to satisfy the requests of messages 2 and 3, the remaining messages are freed by sr_reply(), just as message 1 was previously freed. Note that the srf_read_q field in the right-hand figure above was set by sr_rwio() before returning.
0015160 assert(result == OK || result == SUSPEND);
0015161 send_reply= (result == SUSPEND);
0015162 free_mess= 0;
0015163 break;
0015164 case CANCEL:
0015165 result= sr_cancel(&m->mq_mess);
sr_cancel()
Unlike sr_open(), sr_close(), and sr_rwio(), sr_cancel() does not call the protocol-specific cancel function directly. Instead, sr_cancel() calls walk_queue() for the queues (read, write, and ioctl) of the given channel (sr_fd); walk_queue() walks a queue until it finds a message with the same process number.
If the message is the first message in the queue, walk_queue() calls the protocol-specific cancel function (see line 15422). If a message is found but it is not the first message in the queue, walk_queue() removes the message from the queue and returns EINTR. If a message is not found, walk_queue() returns EAGAIN, which indicates a problem: the assert on line 15166 accepts only OK and EINTR.
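The three outcomes described above can be illustrated with a simplified queue walk. This is a sketch under stated assumptions, not the code of walk_queue(): struct req is a reduced stand-in for mq_t, proto_cancel() is a hypothetical placeholder for the protocol-specific cancel function, and 0 plays the role of OK.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Reduced stand-in for mq_t: only the fields this sketch needs. */
struct req { int proc_nr; struct req *next; };

static int cancel_calls;                 /* how often proto_cancel() ran */

/* Hypothetical placeholder for e.g. udp_cancel(); 0 stands for OK. */
static int proto_cancel(void) { cancel_calls++; return 0; }

/* Search one queue for a request issued by proc_nr.
 * Returns 0 if the request was at the head (the protocol cancel is called),
 * EINTR if it was unlinked from further back, EAGAIN if it was not found. */
static int walk_queue_sketch(struct req **head, struct req **tail, int proc_nr)
{
    struct req *prev = NULL, *q;

    assert(head != NULL && tail != NULL);
    for (q = *head; q != NULL; prev = q, q = q->next) {
        if (q->proc_nr != proc_nr)
            continue;
        if (prev == NULL)
            return proto_cancel();       /* head: the request is in progress */
        prev->next = q->next;            /* unlink a request that only waits */
        if (*tail == q)
            *tail = prev;
        q->next = NULL;
        return EINTR;
    }
    return EAGAIN;
}
```

The head of a queue is treated differently because it is the request the protocol layer is currently working on; requests behind it have not been handed to the protocol yet, so unlinking them is sufficient.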
0015166 assert(result == OK || result == EINTR);
0015167 send_reply= (result == EINTR);
0015168 free_mess= 1;
0015169 m->mq_mess.m_type= 0;
0015170 break;
0015171 #if !CRAMPED
0015172 default:
0015173 ip_panic(("unknown message, from %d, type %d",
0015174 m->mq_mess.m_source, m->mq_mess.m_type));
0015175 #endif
0015176 }
0015177 if (send_reply)
0015178 {
0015179 sr_reply(m, result, FALSE);
sr_reply()
sr_reply(mq, status, can_enqueue) attempts to send a REVIVE message back to the file system (FS) (which is what the FS expects to receive). sr_reply() is called from sr_rec() and from sr_get_userdata() and sr_put_userdata().
If called from sr_rec(), there are two scenarios. If the message mq (sr_reply()'s first parameter) was a request for a DEV_OPEN, DEV_CLOSE, or CANCEL, a reply is guaranteed to succeed and therefore there is no need to enqueue the message in repl_queue. If the message mq was a request for a DEV_READ, DEV_WRITE, or DEV_IOCTL operation that was suspended, a reply is also guaranteed to succeed. In both of these scenarios, it is not necessary to queue the message in repl_queue and, therefore, can_enqueue is FALSE if called from sr_rec().
However, if sr_reply() is called from sr_get_userdata() or sr_put_userdata(), the message may fail due to deadlock. Also, the message must eventually reach the file system so that the user process may be unblocked. Therefore, the reply message must be placed in repl_queue if the message doesn't succeed due to deadlock. For this reason, can_enqueue is TRUE if called from sr_get_userdata() or sr_put_userdata().
It is important to understand why deadlocks can occur. During the initialization of the network service, the network service calls svrctl() (which calls do_svrctl()) to initialize itself with the FS. do_svrctl() sets the fields of the network service's dmap[] entry to gen_io() for read, write, and ioctl requests and clone_opcl() for open and close requests. For this reason, whenever a read, write, or ioctl request for a network file is received by the file system, gen_io() is called and whenever an open or close request is received by the file system, clone_opcl() is called (note that clone_opcl() calls gen_io()).
If gen_io() calls sendrec() to send a message to the network service and there is a deadlock because the network service also attempted to send a message to the file system (i.e., sendrec() returned ELOCKED), gen_io() will call receive() instead.
If the read, write, or ioctl request could not be immediately satisfied, the network service will later send() another REVIVE message after the read, write, or ioctl operation has completed (successfully or unsuccessfully). However, the file system will not necessarily handle any deadlock that occurs (since the main loop in the file system may possibly handle the message) and so, if a deadlock occurs, the network service must later attempt to resend the REVIVE message that was not successful due to the deadlock. The network service does this by placing the message in repl_queue. The messages in repl_queue will be sent again the next time that sr_repl_queue() is called.
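The retry mechanism can be sketched as a queue of replies that could not be delivered. Everything here is an illustrative assumption: struct reply, try_send(), reply_sketch(), and flush_repl_queue() are hypothetical names, and the fs_busy variable merely simulates the deadlock condition in which a send to the FS fails.

```c
#include <assert.h>
#include <stddef.h>

struct reply { int status; struct reply *next; };

static struct reply *repl_head, *repl_tail; /* stand-in for repl_queue */
static int fs_busy;                          /* simulates the deadlock */
static int delivered;                        /* replies that reached the "FS" */

/* Stand-in for send(): fails while the FS is itself sending to us. */
static int try_send(struct reply *r)
{
    (void)r;
    if (fs_busy)
        return -1;
    delivered++;
    return 0;
}

/* Send at once if possible; otherwise keep the reply if the caller allows it.
 * Returns 1 if the reply was sent or queued, 0 if it was dropped. */
static int reply_sketch(struct reply *r, int can_enqueue)
{
    assert(r != NULL);
    if (try_send(r) == 0)
        return 1;
    if (!can_enqueue)
        return 0;
    r->next = NULL;
    if (repl_head == NULL)
        repl_head = r;
    else
        repl_tail->next = r;
    repl_tail = r;
    return 1;
}

/* Retry the queued replies in order; stop at the first one that still fails. */
static void flush_repl_queue(void)
{
    while (repl_head != NULL && try_send(repl_head) == 0)
        repl_head = repl_head->next;
    if (repl_head == NULL)
        repl_tail = NULL;
}
```

Retrying in arrival order keeps replies for the same process from overtaking one another, which is the reason a simple FIFO is used in this sketch.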
0015180 }
0015181 if (free_mess)
0015182 mq_free(m);
mq_free()
mq_free(mq) places mq, mq_free()'s only parameter, back on mq_freelist.
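A free list of this kind is commonly built on top of a fixed pool. The following is a minimal sketch of the idea, not the contents of mq.c: the pool size MQ_SIZE, the link field name mq_next, and the _sketch function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MQ_SIZE 4                       /* assumed pool size, illustration only */

struct mq { struct mq *mq_next; };      /* the real mq_t also holds a message */

static struct mq mq_list[MQ_SIZE];      /* the fixed pool */
static struct mq *mq_freelist;          /* singly linked list of unused entries */

/* Thread every pool entry onto the free list. */
static void mq_init_sketch(void)
{
    int i;

    mq_freelist = NULL;
    for (i = 0; i < MQ_SIZE; i++) {
        mq_list[i].mq_next = mq_freelist;
        mq_freelist = &mq_list[i];
    }
}

/* Take one entry off the free list, or return NULL if the pool is empty. */
static struct mq *mq_get_sketch(void)
{
    struct mq *mq = mq_freelist;

    if (mq != NULL)
        mq_freelist = mq->mq_next;
    return mq;
}

/* Put an entry back at the front of the free list. */
static void mq_free_sketch(struct mq *mq)
{
    assert(mq != NULL);
    mq->mq_next = mq_freelist;
    mq_freelist = mq;
}
```

Because entries are pushed and popped at the front, both operations take constant time and no dynamic allocation is needed.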
0015183 }
0015184
0015185 PUBLIC void sr_add_minor(minor, port, openf, closef, readf, writef,
0015186 ioctlf, cancelf)
0015187 int minor;
0015188 int port;
0015189 sr_open_t openf;
0015190 sr_close_t closef;
0015191 sr_read_t readf;
0015192 sr_write_t writef;
0015193 sr_ioctl_t ioctlf;
0015194 sr_cancel_t cancelf;
sr_fd / sr_fd_table[] / sr_add_minor()
One of the most important data arrays in the network service is sr_fd_table[], an array of 64 struct sr_fd's. Each sr_fd element in sr_fd_table[] corresponds to either a device or an opened file descriptor to a device (i.e., a "channel"):
typedef struct sr_fd
{
int srf_flags;
int srf_fd;
int srf_port;
sr_open_t srf_open;
sr_close_t srf_close;
sr_write_t srf_write;
sr_read_t srf_read;
sr_ioctl_t srf_ioctl;
sr_cancel_t srf_cancel;
mq_t *srf_ioctl_q, *srf_ioctl_q_tail;
mq_t *srf_read_q, *srf_read_q_tail;
mq_t *srf_write_q, *srf_write_q_tail;
} sr_fd_t;
For each device (e.g., /dev/udp0), an element in sr_fd_table[] is configured by sr_add_minor(). For example, for the following inet.conf file:
eth0 DP8390 0 { default; };
psip1;
an element (i.e., a struct sr_fd) is configured for each of the following devices:
/dev/eth0 sr_fd_table[1]
/dev/ip0 sr_fd_table[2]
/dev/tcp0 sr_fd_table[3]
/dev/udp0 sr_fd_table[4]
/dev/psip1 sr_fd_table[17]
/dev/ip1 sr_fd_table[18]
/dev/tcp1 sr_fd_table[19]
/dev/udp1 sr_fd_table[20]
sr_add_minor() is called in the initialization routines for the various protocols: mnx_eth.c (osdep_eth_init()), psip.c (psip_enable()), ip.c (ip_init()), tcp.c (tcp_init()), and udp.c (udp_init()).
When a device file (e.g., /dev/udp0) is opened by a process, the element that corresponds to the device is copied to an element that is currently unoccupied (see sr_open()). In this way, a "channel" is opened. Using this technique, a channel can be opened, closed, and manipulated without affecting the elements of the descriptors initially set by sr_add_minor().
int srf_flags:
srf_flags is a combination of the following:
#define SFF_FREE 0x00
#define SFF_MINOR 0x01
#define SFF_INUSE 0x02
#define SFF_BUSY 0x3C
#define SFF_IOCTL_IP 0x04
#define SFF_READ_IP 0x08
#define SFF_WRITE_IP 0x10
#define SFF_PENDING_REQ 0x30
#define SFF_SUSPENDED 0x1C0
#define SFF_IOCTL_SUSP 0x40
#define SFF_READ_SUSP 0x80
#define SFF_WRITE_SUSP 0x100
srf_flags is initialized to SFF_FREE for each element in sr_fd_table[]. If the channel corresponds to a device file, srf_flags is set to SFF_INUSE | SFF_MINOR. If the channel does not correspond to a device file, srf_flags is set simply to SFF_INUSE.
When a request comes in for a read, write, or ioctl operation and the network service is not already processing another request for the same operation, the corresponding flag (SFF_READ_IP, SFF_WRITE_IP, or SFF_IOCTL_IP) is set in srf_flags. If the protocol-specific function (e.g., udp_read()) then cannot complete the request and returns NW_SUSPEND, the appropriate flag (SFF_IOCTL_SUSP, SFF_READ_SUSP, or SFF_WRITE_SUSP) is also set in srf_flags.
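The interplay of the "in progress" and "suspended" bits can be shown for the read case alone. This is a sketch: the flag values are copied from the list above, but begin_read(), suspend_read(), and finish_read() are hypothetical helpers, and clearing both bits together on completion is an assumption about how the flags are reset.

```c
#include <assert.h>

#define SFF_INUSE      0x02
#define SFF_READ_IP    0x08
#define SFF_READ_SUSP  0x80

/* Returns 1 if the read may start now, 0 if it has to be queued
 * behind a read that is already in progress. */
static int begin_read(int *flags)
{
    if (*flags & SFF_READ_IP)
        return 0;
    *flags |= SFF_READ_IP;
    return 1;
}

/* The protocol function could not complete the read (NW_SUSPEND). */
static void suspend_read(int *flags)
{
    assert(*flags & SFF_READ_IP);       /* only a started read can suspend */
    *flags |= SFF_READ_SUSP;
}

/* The read has completed: clear both read bits, leave the others alone. */
static void finish_read(int *flags)
{
    *flags &= ~(SFF_READ_IP | SFF_READ_SUSP);
}
```

Because each state is a separate bit, a channel can have, say, a suspended read and an active ioctl at the same time without the two interfering.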
int srf_fd, srf_port:
srf_fd and srf_port are both set by sr_add_minor(). For the channels in sr_fd_table[] that correspond to the device files (e.g., /dev/udp0), srf_fd is set to the minor device number of the device. For example, if /dev/udp0 is added to sr_fd_table[] and the interface number of the device file is 0 (see comments for ip_conf[]), then the minor device number is:
if2minor(ifno, dev) = (0)*16 + UDP_DEV = 0 + 4 = 4
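The same arithmetic can be checked for the other entries of the device table. In this sketch only UDP_DEV and if2minor come from the text; the remaining constant names, their values (read off the table of devices above), and the two decoding helpers are assumptions for illustration.

```c
#include <assert.h>

/* Offsets within one interface, read off the device table above.
 * Only UDP_DEV is named in the text; the other names are assumed. */
enum { ETH_DEV = 1, IP_DEV = 2, TCP_DEV = 3, UDP_DEV = 4 };

#define DEVS_PER_IF 16                  /* 16 minor numbers per interface */
#define if2minor(ifno, dev) ((ifno) * DEVS_PER_IF + (dev))

/* Hypothetical helpers that invert the mapping. */
static int minor2ifno(int minor)
{
    assert(minor >= 0);
    return minor / DEVS_PER_IF;
}

static int minor2dev(int minor)
{
    assert(minor >= 0);
    return minor % DEVS_PER_IF;
}
```

With these definitions, if2minor(1, UDP_DEV) is 1*16 + 4 = 20, which matches the entry for /dev/udp1 in the table.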
For the channels in sr_fd_table[] that do not correspond to a device file, srf_fd is the file descriptor for the appropriate protocol. For example, if the file system requests that a udp channel be opened, srf_open is dereferenced and udp_open() is called. udp_open() opens a udp file descriptor and returns the index of the corresponding element in udp_fd_table[]. srf_fd is set to the index of this element.
Later, when the file system requests a read or a write on the open channel, srf_fd is passed into the protocol-specific read or write function (e.g., udp_read()), allowing the protocol-specific function to locate the appropriate file descriptor (e.g., udp file descriptor).
srf_port is more straightforward. srf_port is the index in the protocol's port table. For example, if a system has two udp device files (/dev/udp0 and /dev/udp1), udp_port_table[] will have two entries, 0 and 1. Therefore, srf_port for the entry in sr_fd_table[] that corresponds to /dev/udp0 will be 0 and srf_port for the entry that corresponds to /dev/udp1 will be 1.
sr_open_t srf_open:
sr_close_t srf_close:
sr_write_t srf_write:
sr_read_t srf_read:
sr_ioctl_t srf_ioctl:
sr_cancel_t srf_cancel:
The fields above are all pointers to protocol-specific functions and are all set by sr_add_minor(). For example, when sr_add_minor() is called by udp_init(), srf_open, srf_close, srf_write, srf_read, srf_ioctl, and srf_cancel are set to point to the functions udp_open(), udp_close(), udp_write(), udp_read(), udp_ioctl(), and udp_cancel(). Later, when the file system makes a request to the network service, these functions will be called. For example, if the file system requests that data be written to a channel, srf_write is dereferenced and, if the channel is a udp channel, udp_write() is called.
mq_t *srf_ioctl_q, *srf_ioctl_q_tail:
mq_t *srf_read_q, *srf_read_q_tail:
mq_t *srf_write_q, *srf_write_q_tail:
The fields above are the heads and tails of linked lists of ioctl, read, and write messages waiting to be processed. When a message requesting an ioctl, read, or write operation is received while an earlier request of the same type is still in progress, the message is placed at the end of the corresponding linked list; otherwise, it is processed immediately.
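Keeping both a head and a tail pointer allows a message to be appended without walking the list. The sketch below shows the two basic operations on such a queue; struct msg is a reduced stand-in for mq_t, and q_append() and q_pop() are hypothetical helper names rather than functions from sr.c.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-in for mq_t; only the link field matters here. */
struct msg { int id; struct msg *next; };

/* Append m to the queue described by *head and *tail. */
static void q_append(struct msg **head, struct msg **tail, struct msg *m)
{
    assert(m != NULL);
    m->next = NULL;
    if (*head == NULL)
        *head = m;                  /* empty queue: m becomes the head */
    else
        (*tail)->next = m;          /* otherwise link it after the old tail */
    *tail = m;
}

/* Remove and return the head, or NULL if the queue is empty. */
static struct msg *q_pop(struct msg **head, struct msg **tail)
{
    struct msg *m = *head;

    if (m != NULL) {
        *head = m->next;
        if (*head == NULL)
            *tail = NULL;           /* the queue is now empty */
        m->next = NULL;
    }
    return m;
}
```

Passing the addresses of the head and tail fields mirrors the q_head_ptr and q_tail_ptr variables of sr_rwio(), which let one piece of code serve the read, write, and ioctl queues alike.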
After the initialization of the network service, sr_rec() is called upon receipt of messages from the file system in the endless loop within main(). sr_rec() then calls a function to handle the specific request. For open requests, sr_rec() calls sr_open(); for read, write, and ioctl requests, sr_rec() calls sr_rwio(); for close requests, sr_rec() calls sr_close(); for cancel requests, sr_rec() calls sr_cancel().
0015195 {
0015196 sr_fd_t *sr_fd;
0015197
sr_add_minor() checks to make sure that the minor device number is an acceptable value (i.e., less than 64 - there can be 16 minor devices for a maximum of 4 interfaces) and also verifies that the element is not already in use (line 15202).
sr_add_minor() then marks the element as used (line 15204) and fills in all the fields of the element (for example, sr_fd_table[4] is filled in for /dev/udp0). The various function fields (e.g., the srf_close field, which is set from closef on line 15207) are filled in with the protocol-specific function (e.g., udp_close()).
The port number is the position of each device for a given protocol in the /etc/inet.conf file. For example, if there are two ethernet devices specified in the inet.conf file, the first ethernet device will have a port number of 0 and the second ethernet device will have a port number of 1. (See line 10086 in mnx_eth.c.)
0015198 assert (minor>=0 && minor<FD_NR);
0015199
0015200 sr_fd= &sr_fd_table[minor];
0015201
0015202 assert(!(sr_fd->srf_flags & SFF_INUSE));
0015203
0015204 sr_fd->srf_flags= SFF_INUSE | SFF_MINOR;
0015205 sr_fd->srf_port= port;
0015206 sr_fd->srf_open= openf;
0015207 sr_fd->srf_close= closef;
0015208 sr_fd->srf_write= writef;
0015209 sr_fd->srf_read= readf;
0015210 sr_fd->srf_ioctl= ioctlf;
0015211 sr_fd->srf_cancel= cancelf;
0015212 }
0015213
0015214 PRIVATE int sr_open(m)
0015215 message *m;
0015216 {
0015217 sr_fd_t *sr_fd;
0015218
0015219 int minor= m->DEVICE;
0015220 int i, fd;
0015221
0015222 if (minor<0 || minor>FD_NR)
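Verify that the minor device number is in range (for the 80386, there can be 64 channels, numbered 0 through 63). Note that the comparison as written, minor>FD_NR, does not reject a minor number equal to FD_NR; compare the stricter assert on line 15198, which requires minor<FD_NR.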
EINVAL (defined in include/errno.h) stands for "invalid argument".
0015223 {
0015224 DBLOCK(1, printf("replying EINVAL\n"));
0015225 return EINVAL;
0015226 }
0015227 if (!(sr_fd_table[minor].srf_flags & SFF_MINOR))
Verify that the sr_fd_table[] element refers to a minor device.
0015228 {
ENXIO (defined in include/errno.h) stands for "no such device or address" (in our case, "no such device" is relevant).
0015229 DBLOCK(1, printf("replying ENXIO\n"));
0015230 return ENXIO;
0015231 }
0015232 for (i=0; i<FD_NR && (sr_fd_table[i].srf_flags & SFF_INUSE); i++);
Find the first sr_fd_table[] element that is not in use. i is the index of that element and is used in the following lines.
0015233
0015234 if (i>=FD_NR)
0015235 {
0015236 DBLOCK(1, printf("replying ENFILE\n"));
0015237 return ENFILE;
0015238 }
0015239
0015240 sr_fd= &sr_fd_table[i];
0015241 *sr_fd= sr_fd_table[minor];
The sr_fd struct with the minor device number minor (for example, sr_fd_table[4] for /dev/udp0) is copied to the open slot (i.e., sr_fd_table[i]).
0015242 sr_fd->srf_flags= SFF_INUSE;
The element is marked as in use. Note that the element is not marked as a minor device (SFF_MINOR). Also note that srf_port is the same as that of the minor device element just copied. For example, if a process opens /dev/udp1, srf_port will be 1 (see comment for line 15002).
When the device is closed, srf_flags is set to SFF_FREE (see line 15267).
0015243 fd= (*sr_fd->srf_open)(sr_fd->srf_port, i, sr_get_userdata,
0015244 sr_put_userdata, 0);
Call the protocol-specific open function. For example, the open function for the udp protocol is udp_open() and for tcp is tcp_open(). These functions return a protocol-specific file descriptor which, like an sr file descriptor (sr_fd), is simply the index of an element in a table (e.g., udp_fd_table[]). srf_fd for the new channel is set to this value (see line 15251).
udp_open()
udp_open(port, srfd, get_userdata, put_userdata, put_pkt), called from sr_open(), finds the first unused element in udp_fd_table[], configures this udp file descriptor with the arguments passed in, and returns the index of the (newly configured) element within udp_fd_table[].
The parameters are explained below:
srfd: When a process opens one of the udp devices (e.g., /dev/udp), sr_open() claims an unused element in sr_fd_table[] and copies the element of sr_fd_table[] that corresponds to the device to this unused element. The index of this element is then passed in as the second argument (srfd). Later, when the process wishes to read, write, or perform an ioctl on this open file descriptor, it includes this sr file descriptor in the message that it sends the file system.
port: For the following example inet.conf file:
eth0 DP8390 0 { default; };
psip1;
there are 2 ports. Port 0 corresponds to the ethernet device and port 1 corresponds to the psip port.
get_userdata: a pointer to the function that gets the data from the process. This function will always be sr_get_userdata().
put_userdata: a pointer to the function that copies data into the user process. This function will always be sr_put_userdata().
put_pkt: this parameter is not used in this function.
0015245 if (fd<0)
0015246 {
0015247 sr_fd->srf_flags= SFF_FREE;
0015248 DBLOCK(1, printf("replying %d\n", fd));
0015249 return fd;
0015250 }
0015251 sr_fd->srf_fd= fd;
This is very important. As described in the sr_fd general comment, this allows the code to find the corresponding entry in the protocol-specific file descriptor table (e.g., udp_fd_table[]).
0015252 return i;
Return the index of the newly opened channel. This value is ultimately sent back to the calling process (see line 15179).
0015253 }
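The copy-on-open technique used by sr_open() can be reproduced in a reduced form. This is a sketch under several assumptions: struct slot keeps only a few of the sr_fd fields, demo_open() is a hypothetical protocol open function, errors are returned as negated errno values so that they cannot be confused with a slot index, and the range check uses >= rather than the > that appears on line 15222.

```c
#include <assert.h>
#include <errno.h>

#define FD_NR      64
#define SFF_FREE   0x00
#define SFF_MINOR  0x01
#define SFF_INUSE  0x02

/* Reduced stand-in for struct sr_fd. */
struct slot {
    int flags;
    int fd;
    int port;
    int (*open)(int port, int srfd);
};

static struct slot table[FD_NR];

/* Hypothetical protocol open function: returns a protocol descriptor. */
static int demo_open(int port, int srfd) { (void)srfd; return 100 + port; }

/* Configure the template element for a device, as sr_add_minor() does. */
static void add_minor_sketch(int minor, int port)
{
    assert(minor >= 0 && minor < FD_NR);
    table[minor].flags = SFF_INUSE | SFF_MINOR;
    table[minor].port = port;
    table[minor].open = demo_open;
}

/* Open a channel: copy the device's element into the first free slot. */
static int open_sketch(int minor)
{
    int i, fd;

    if (minor < 0 || minor >= FD_NR)
        return -EINVAL;
    if (!(table[minor].flags & SFF_MINOR))
        return -ENXIO;
    for (i = 0; i < FD_NR && (table[i].flags & SFF_INUSE); i++)
        ;                               /* find the first free slot */
    if (i >= FD_NR)
        return -ENFILE;
    table[i] = table[minor];            /* inherit port and function pointers */
    table[i].flags = SFF_INUSE;         /* a channel, not a minor device */
    fd = table[i].open(table[i].port, i);
    if (fd < 0) {
        table[i].flags = SFF_FREE;      /* give the slot back on failure */
        return fd;
    }
    table[i].fd = fd;
    return i;
}
```

The template element for the device is never modified by an open, so any number of channels can be created from it and closed again independently.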
0015254
0015255 PRIVATE void sr_close(m)
0015256 message *m;
0015257 {
0015258 sr_fd_t *sr_fd;
0015259
0015260 sr_fd= sr_getchannel(m->DEVICE);
sr_getchannel()
For a given channel (i.e., an index within sr_fd_table[]), sr_getchannel(minor) returns a pointer to the corresponding sr_fd element.
Note that minor, sr_getchannel()'s only parameter, does not correspond to a minor device. Instead, minor corresponds to a normal slot within sr_fd_table[].
0015261 assert (sr_fd);
0015262
0015263 assert (!(sr_fd->srf_flags & SFF_BUSY));
0015264
0015265 assert (!(sr_fd->srf_flags & SFF_MINOR));
0015266 (*sr_fd->srf_close)(sr_fd->srf_fd);
Call the protocol-specific close function for the given channel. For example, if the channel is a udp channel, udp_close() is called. Note that srf_close was set by sr_add_minor().
Mark the slot as free so that another channel can later claim the slot.
0015267 sr_fd->srf_flags= SFF_FREE;
0015268 }
0015269
0015270 PRIVATE int sr_rwio(m)
0015271 mq_t *m;
0015272 {
0015273 sr_fd_t *sr_fd;
0015274 mq_t **q_head_ptr, **q_tail_ptr;
0015275 int ip_flag, susp_flag;
0015276 int r;
0015277 ioreq_t request;
request is only used for ioctl requests (see line 15334).
ioreq_t is declared in net/inet.h as:
typedef int ioreq_t;
0015278 size_t size;
0015279
0015280 sr_fd= sr_getchannel(m->mq_mess.DEVICE);
sr_getchannel()
For a given channel (i.e., an index within sr_fd_table[]), sr_getchannel(minor) returns a pointer to the corresponding sr_fd element.
Note that minor, sr_getchannel()'s only parameter, does not correspond to a minor device. Instead, minor corresponds to a normal slot within sr_fd_table[].
0015281 assert (sr_fd);
0015282
For each message type, use the appropriate message queue and the appropriate flags. This queue and these flags will be used in the block that begins on line 15309.
0015283 switch(m->mq_mess.m_type)
0015284 {
0015285 case DEV_READ:
0015286 q_head_ptr= &sr_fd->srf_read_q;
0015287 q_tail_ptr= &sr_fd->srf_read_q_tail;
0015288 ip_flag= SFF_READ_IP;
0015289 susp_flag= SFF_READ_SUSP;
0015290 break;
0015291 case DEV_WRITE:
0015292 q_head_ptr= &sr_fd->srf_write_q;
0015293 q_tail_ptr= &sr_fd->srf_write_q_tail;
0015294 ip_flag= SFF_WRITE_IP;
0015295 susp_flag= SFF_WRITE_SUSP;
0015296 break;
0015297 case DEV_IOCTL:
0015298 q_head_ptr= &sr_fd->srf_ioctl_q;
0015299 q_tail_ptr= &sr_fd->srf_ioctl_q_tail;
0015300 ip_flag= SFF_IOCTL_IP;
0015301 susp_flag= SFF_IOCTL_SUSP;
0015302 break;
0015303 #if !CRAMPED
0015304 default:
0015305 ip_panic(("illegal case entry"));
0015306 #endif
0015307 }
0015308
0015309 if (sr_fd->srf_flags & ip_flag)
If the channel is already servicing a similar request, put the message at the tail of the appropriate message queue and return SUSPEND.
0015310 {
0015311 assert(sr_fd->srf_flags & susp_flag);
0015312 assert(*q_head_ptr);
0015313
0015314 (*q_tail_ptr)->mq_next= m;
0015315 *q_tail_ptr= m;
0015316 return SUSPEND;
0015317 }
0015318 assert(!*q_head_ptr);
0015319
0015320 *q_tail_ptr= *q_head_ptr= m;
If this point in the code is reached, the relevant message queue was empty. Therefore, the head and the tail of the relevant message queue are both set to the message just received.
For a read request, raise SFF_READ_IP in srf_flags. For a write request, raise SFF_WRITE_IP. For an ioctl request, raise SFF_IOCTL_IP (see lines 15288, 15294, and 15300). The flag is OR'ed in, so the other flags of the channel are left untouched.
0015321 sr_fd->srf_flags |= ip_flag;
0015322
0015323 switch(m->mq_mess.m_type)
Call the appropriate function. For example, for a udp channel, call udp_read(), udp_write(), or udp_ioctl().
sr_fd->srf_fd is the corresponding slot in the protocol-specific file descriptor table (e.g., udp_fd_table[]).
m->mq_mess.COUNT is the number of bytes requested for the read or write operation.
It is very important to understand that if udp_read(), udp_write(), and udp_ioctl() are able to satisfy their requests, they will call either sr_put_userdata() (for udp_read()) or sr_get_userdata() (for udp_write() and udp_ioctl()). Understanding the role that sr_put_userdata() and sr_get_userdata() play is probably the largest challenge in understanding the sr.c file.
0015324 {
0015325 case DEV_READ:
0015326 r= (*sr_fd->srf_read)(sr_fd->srf_fd,
0015327 m->mq_mess.COUNT);
0015328 break;
0015329 case DEV_WRITE:
0015330 r= (*sr_fd->srf_write)(sr_fd->srf_fd,
0015331 m->mq_mess.COUNT);
0015332 break;
0015333 case DEV_IOCTL:
0015334 request= m->mq_mess.REQUEST;
0015335 #ifdef _IOCPARM_MASK
_IOCPARM_MASK is #define'd in include/sys/ioctl.h as:
#define _IOCPARM_MASK 0x1FFF
/* Ioctls have the command encoded in the low-order word, and the size
* of the parameter in the high-order word. The 3 high bits of the high-
* order word are used to encode the in/out/void status of the parameter.
*/
size is used to determine if request contains a valid ioctl request.
An example of a valid request is NWIOSUDPOPT, which is found in include/sys/ioctl.h:
#define NWIOSUDPOPT _IOW('n', 64, struct nwio_udpopt)
#define _IOW(x,y,t) ((x << 8) | y | ((sizeof(t) & _IOCPARM_MASK) << 16) |\
_IOC_IN)
#define _IOCPARM_MASK 0x1FFF
#define _IOC_IN 0x40000000
Therefore,
NWIOSUDPOPT = _IOW('n', 64, struct nwio_udpopt) = _IOW(0x6E, 0x40, struct nwio_udpopt)
= (0x6E << 8) | 0x40 | ((sizeof(struct nwio_udpopt) & 0x1FFF) << 16) | 0x40000000
= 0x6E00 | 0x40 | ((20 & 0x1FFF) << 16) | 0x40000000
= 0x40146E40
0015336 size= (request >> 16) & _IOCPARM_MASK;
In the example above:
size = (0x40146E40 >> 16) & 0x1FFF = 0x14 = 20
Note that 20 = sizeof(nwio_udpopt)
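The arithmetic above can be checked with a small sketch. The names x_iow(), x_ioc_size(), and x_size_ok() are hypothetical stand-ins for the MINIX macros (they are not part of sys/ioctl.h), and the parameter size of 20 bytes is simply the value used in the example above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the macros quoted above; the real
 * definitions live in include/sys/ioctl.h. */
#define X_IOCPARM_MASK 0x1FFFu
#define X_IOC_IN       0x40000000u
#define X_MAX_IOCTL_S  512u        /* same value as MAX_IOCTL_S */

/* Build an "input" ioctl code from a group letter, a command number,
 * and the size in bytes of the parameter struct. */
uint32_t x_iow(unsigned group, unsigned cmd, unsigned param_size)
{
    return ((uint32_t)group << 8) | cmd |
           (((uint32_t)param_size & X_IOCPARM_MASK) << 16) | X_IOC_IN;
}

/* Recover the parameter size the way line 15336 does. */
unsigned x_ioc_size(uint32_t request)
{
    return (request >> 16) & X_IOCPARM_MASK;
}

/* Mirror of the check on line 15337: accept only small parameters. */
int x_size_ok(uint32_t request)
{
    return x_ioc_size(request) <= X_MAX_IOCTL_S;
}
```

With these helpers, x_iow('n', 64, 20) reproduces the request code computed above, and a request that encodes a 600-byte parameter is rejected by x_size_ok().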
0015337 if (size>MAX_IOCTL_S)
MAX_IOCTL_S is declared in inet/generic/sr.h as:
#define MAX_IOCTL_S 512
This ensures that the struct passed from the user process to the network service is not greater than 512 bytes. For the NWIOSUDPOPT request, size is 20 bytes and therefore acceptable.
0015338 {
0015339 DBLOCK(1, printf("replying EINVAL\n"));
0015340 r= sr_put_userdata(sr_fd-sr_fd_table, EINVAL,
0015341 NULL, 1);
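Reply with the error EINVAL ("invalid argument"), which is passed back to the user process. Because the data argument is NULL, sr_put_userdata() copies nothing and only sends the reply.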
sr_fd-sr_fd_table is the index of the channel.
0015342 assert(r == OK);
0015343 return OK;
0015344 }
0015345 #endif
0015346 r= (*sr_fd->srf_ioctl)(sr_fd->srf_fd, request);
Call the channel's protocol-specific ioctl function (e.g., udp_ioctl()). sr_fd->srf_fd is the slot in the underlying protocol's file descriptor table (e.g., udp_fd_table[]).
0015347 break;
0015348 #if !CRAMPED
0015349 default:
0015350 ip_panic(("illegal case entry"));
0015351 #endif
0015352 }
0015353
0015354 assert(r == OK || r == SUSPEND ||
0015355 (printf("r= %d\n", r), 0));
0015356 if (r == SUSPEND)
0015357 sr_fd->srf_flags |= susp_flag;
If the result of the operation is SUSPEND, then set the specific flag (e.g., SFF_READ_SUSP) and return SUSPEND. SUSPEND is #define'd in /include/minix/com.h:
# define SUSPEND -998 /* used in interrupts when tty has no data */
An example of a suspended operation is a read operation that is requested when there is no data waiting.
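The suspend flag (as well as the ip flag) is ultimately cleared when the operation completes: on line 15530 in sr_get_userdata(), once data has been fetched from the user process (for a write or ioctl operation), or on line 15591, once data is available to hand to the user process, for example from the ethernet buffer (for a read or ioctl operation).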
0015358 return r;
0015359 }
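The queueing decision made by sr_rwio() can be sketched in isolation as follows. This is a toy model, not the MINIX code: toy_chan, toy_req, and toy_proto_read() are hypothetical names, there is a single queue instead of three, and the protocol handler is reduced to a flag that says whether data is ready.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_OK      0
#define TOY_SUSPEND (-998)   /* same value as SUSPEND quoted above */
#define TOY_IP      0x1      /* an operation is in progress */
#define TOY_SUSP    0x2      /* the operation in progress is suspended */

struct toy_req { int id; struct toy_req *next; };

struct toy_chan {
    int flags;
    struct toy_req *head, *tail;
    int data_ready;          /* stands in for the protocol having data */
};

/* Hypothetical protocol handler.  On success it completes the request
 * at the head of the queue, which is roughly what sr_put_userdata()
 * does for a real read: empty the queue head and clear both flags. */
int toy_proto_read(struct toy_chan *c)
{
    if (!c->data_ready)
        return TOY_SUSPEND;
    c->head = c->tail = NULL;
    c->flags &= ~(TOY_IP | TOY_SUSP);
    return TOY_OK;
}

int toy_rwio(struct toy_chan *c, struct toy_req *m)
{
    int r;

    m->next = NULL;
    if (c->flags & TOY_IP) {     /* busy: append the request and suspend */
        assert(c->head != NULL);
        c->tail->next = m;
        c->tail = m;
        return TOY_SUSPEND;
    }
    assert(c->head == NULL);
    c->head = c->tail = m;       /* first request on an idle channel */
    c->flags |= TOY_IP;
    r = toy_proto_read(c);
    if (r == TOY_SUSPEND)
        c->flags |= TOY_SUSP;
    return r;
}
```

Feeding three requests to a channel without data reproduces scenario 2: all three calls return the suspend code, the first request stays at the head, and the other two are chained behind it.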
0015360
0015361 PRIVATE int sr_cancel(m)
0015362 message *m;
sr_cancel()
Unlike sr_open(), sr_close(), and sr_rwio(), sr_cancel() does not call the protocol specific cancel function directly. Instead, sr_cancel() calls walk_queue(), which walks the message queue for the given channel (sr_fd) for each of the queues (read, write, and ioctl) until it finds a message with the same process number.
If the message is the first message in the queue, walk_queue() calls the protocol specific cancel function (see line 15422). If a message is found but it is not the first message in the queue, walk_queue() removes the message from the queue and returns EINTR. If a message is not found, walk_queue() returns EAGAIN. If all three queues return EAGAIN, the request to be cancelled does not exist and sr_cancel() panics (see line 15398).
0015363 {
0015364 sr_fd_t *sr_fd;
0015365 int i, result;
0015366 mq_t *q_ptr, *q_ptr_prv;
0015367 int proc_nr, ref, operation;
0015368
0015369 result=EINTR;
The EINTR error number is #define'd in include/errno.h as:
#define EINTR (_SIGN 4) /* interrupted function call */
It is unclear why result is set here. It will be reset later in the function.
0015370 proc_nr= m->PROC_NR;
0015371 ref= 0;
0015372 operation= 0;
0015373 sr_fd= sr_getchannel(m->DEVICE);
0015374 assert (sr_fd);
0015375
0015376 {
0015377 result= walk_queue(sr_fd, sr_fd->srf_ioctl_q,
0015378 &sr_fd->srf_ioctl_q_tail, SR_CANCEL_IOCTL,
0015379 proc_nr, ref);
walk_queue()
For a given channel (i.e., for a given sr_fd) in sr_fd_table[], walk_queue(sr_fd, q_head, q_tail_ptr, type, proc_nr, ref) finds a message in the appropriate queue (srf_read_q, srf_write_q or srf_ioctl_q) for the given process number (proc_nr). If a message with this process number is found and the message is the first in the queue, call the protocol specific cancel function (e.g., udp_cancel()) and return OK. If a message with this process number is found and it is not the first in the queue, simply remove the message from the queue and return EINTR. If a message with the given process number can't be found in the queue, return EAGAIN. The difference between the first message and any other message in the queue is that the first message is already being processed and therefore some extra cleanup must be done.
ref, the last parameter of walk_queue(), is never used.
0015380 if (result != EAGAIN)
0015381 return result;
0015382 }
0015383 {
0015384 result= walk_queue(sr_fd, sr_fd->srf_read_q,
0015385 &sr_fd->srf_read_q_tail, SR_CANCEL_READ,
0015386 proc_nr, ref);
0015387 if (result != EAGAIN)
0015388 return result;
0015389 }
0015390 {
0015391 result= walk_queue(sr_fd, sr_fd->srf_write_q,
0015392 &sr_fd->srf_write_q_tail, SR_CANCEL_WRITE,
0015393 proc_nr, ref);
0015394 if (result != EAGAIN)
0015395 return result;
0015396 }
0015397 #if !CRAMPED
0015398 ip_panic((
0015399 "request not found: from %d, type %d, MINOR= %d, PROC= %d, REF= %d OPERATION= %d",
0015400 m->m_source, m->m_type, m->DEVICE,
0015401 m->PROC_NR, 0, 0));
0015402 #endif
0015403 }
0015404
0015405 PRIVATE int walk_queue(sr_fd, q_head, q_tail_ptr, type, proc_nr, ref)
0015406 sr_fd_t *sr_fd;
0015407 mq_t *q_head, **q_tail_ptr;
0015408 int type;
0015409 int proc_nr;
0015410 int ref;
walk_queue()
For a given channel (i.e., for a given sr_fd) in sr_fd_table[], walk_queue(sr_fd, q_head, q_tail_ptr, type, proc_nr, ref) finds a message in the appropriate queue (srf_read_q, srf_write_q or srf_ioctl_q) for the given process number (proc_nr). If a message with this process number is found and the message is the first in the queue, call the protocol specific cancel function (e.g., udp_cancel()) and return OK. If a message with this process number is found and it is not the first in the queue, simply remove the message from the queue and return EINTR. If a message with the given process number can't be found in the queue, return EAGAIN. The difference between the first message and any other message in the queue is that the first message is already being processed and therefore some extra cleanup must be done.
ref, the last parameter of walk_queue(), is never used.
0015411 {
0015412 mq_t *q_ptr_prv, *q_ptr;
0015413 int result;
0015414
0015415 for(q_ptr_prv= NULL, q_ptr= q_head; q_ptr;
0015416 q_ptr_prv= q_ptr, q_ptr= q_ptr->mq_next)
0015417 {
0015418 if (q_ptr->mq_mess.PROC_NR != proc_nr)
0015419 continue;
Check whether the process number of the message in the queue is the same as the process number of the message being cancelled.
0015420 if (!q_ptr_prv)
0015421 {
0015422 result= (*sr_fd->srf_cancel)(sr_fd->srf_fd, type);
0015423 assert(result == OK);
0015424 return OK;
0015425 }
0015426 q_ptr_prv->mq_next= q_ptr->mq_next;
The message is not the first message in the queue.
There is a distinction between an operation that has been started and has been interrupted and an operation that is still on the message queue and has been interrupted. In the first case, the call to the protocol specific cancel function causes a reply to be sent to the file system. In the second case, sr_rec() (line 15122) must send the reply.
Remove the message from the queue. If the message is the last message in the queue, set the tail of the queue to point to the previous message.
0015427 mq_free(q_ptr);
0015428 if (!q_ptr_prv->mq_next)
0015429 *q_tail_ptr= q_ptr_prv;
0015430 return EINTR;
A return value of OK (line 15424) means that sr_rec() (line 15211) doesn't have to do anything (srf_cancel did the work). EINTR means that sr_rec() must send a result (EINTR) to the file system.
0015431 }
0015432 return EAGAIN;
0015433 }
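The three outcomes of the queue walk above can be sketched with a plain singly linked list. The names below (toy_msg, toy_walk(), and the enum values) are hypothetical; the enum values stand in for OK, EINTR, and EAGAIN, and the sketch leaves out the protocol cancel call and the mq_free() call.

```c
#include <assert.h>
#include <stddef.h>

enum toy_result {
    TOY_CANCELLED_ACTIVE,   /* stands in for OK: head of queue, in progress */
    TOY_REMOVED_QUEUED,     /* stands in for EINTR: unlinked from the queue */
    TOY_NOT_FOUND           /* stands in for EAGAIN: try the next queue */
};

struct toy_msg { int proc_nr; struct toy_msg *next; };

enum toy_result toy_walk(struct toy_msg *head, struct toy_msg **tail_ptr,
                         int proc_nr)
{
    struct toy_msg *prev = NULL, *cur;

    for (cur = head; cur != NULL; prev = cur, cur = cur->next) {
        if (cur->proc_nr != proc_nr)
            continue;
        if (prev == NULL)               /* first in queue: already started */
            return TOY_CANCELLED_ACTIVE;
        prev->next = cur->next;         /* unlink a request still waiting */
        if (prev->next == NULL)
            *tail_ptr = prev;           /* removed the tail: move it back */
        return TOY_REMOVED_QUEUED;
    }
    return TOY_NOT_FOUND;
}
```

The head of the queue is deliberately not unlinked here, because in the real code the protocol specific cancel function is responsible for completing a request that has already been started.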
0015434
0015435 PRIVATE sr_fd_t *sr_getchannel(minor)
sr_getchannel()
For a given channel (i.e., an index within sr_fd_table[]), sr_getchannel(minor) returns a pointer to the corresponding sr_fd element.
Note that minor, sr_getchannel()'s only parameter, does not correspond to a minor device. Instead, minor corresponds to a normal slot within sr_fd_table[].
0015436 int minor;
0015437 {
0015438 sr_fd_t *loc_fd;
0015439
Verify that minor is an acceptable value (0 <= minor <= 63). If the slot number is valid, return a pointer to the slot (i.e., the corresponding sr_fd struct).
0015440 compare(minor, >=, 0);
compare()
compare is #define'd in inet/generic/assert.h:
#define compare(a,t,b) (!((a) t (b)) ? bad_compare(this_file, __LINE__, \
(a), #a " " #t " " #b, (b)) : (void) 0)
and bad_compare() is defined in inet/inet.c.
If the relationship between the 3 arguments in compare() does not hold, some debugging output is emitted and then Minix is terminated.
For example, if compare(result, >=, 0) is called and result (the first argument) is -1, Minix will be terminated.
0015441 compare(minor, <, FD_NR);
0015442
0015443 loc_fd= &sr_fd_table[minor];
0015444
0015445 assert (!(loc_fd->srf_flags & SFF_MINOR) &&
0015446 (loc_fd->srf_flags & SFF_INUSE));
0015447
0015448 return loc_fd;
0015449 }
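The compare() checks used in sr_getchannel() above rely on the preprocessor's stringizing operator to report the failed expression. The sketch below shows the same idea with hypothetical names (TOY_COMPARE, toy_bad_compare(), toy_slot_ok()). Unlike the MINIX version, it counts failures instead of terminating, and the table size of 64 is an assumption taken from the range 0 to 63 mentioned above.

```c
#include <assert.h>
#include <stdio.h>

int toy_failures = 0;

/* Report a failed comparison; the MINIX bad_compare() halts instead. */
void toy_bad_compare(const char *file, int line, int lhs,
                     const char *expr, int rhs)
{
    fprintf(stderr, "%s:%d: compare failed: %s (%d vs %d)\n",
            file, line, expr, lhs, rhs);
    toy_failures++;
}

/* #a " " #t " " #b turns the three arguments into one string literal,
 * for example "minor >= 0". */
#define TOY_COMPARE(a, t, b) \
    (!((a) t (b)) ? toy_bad_compare(__FILE__, __LINE__, (a), \
                                    #a " " #t " " #b, (b)) : (void)0)

#define TOY_FD_NR 64    /* assumed table size */

/* Returns nonzero if minor is a valid slot number. */
int toy_slot_ok(int minor)
{
    int before = toy_failures;

    TOY_COMPARE(minor, >=, 0);
    TOY_COMPARE(minor, <, TOY_FD_NR);
    return toy_failures == before;
}
```

Passing the operator as a macro argument is what lets a single macro cover every relation (>=, <, ==, and so on) while still printing the original expression.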
0015450
0015451 PRIVATE void sr_reply (mq, status, can_enqueue)
0015452 mq_t *mq;
0015453 int status;
0015454 int can_enqueue;
sr_reply()
sr_reply(mq, status, can_enqueue) attempts to send a REVIVE message back to the file system (FS) (which is what the FS expects to receive). sr_reply() is called from sr_rec() and from sr_get_userdata() and sr_put_userdata().
If called from sr_rec(), there are two scenarios. If the message mq (sr_reply()'s first parameter) was a request for a DEV_OPEN, DEV_CLOSE, or CANCEL, a reply is guaranteed to succeed and therefore there is no need to enqueue the message in repl_queue. If the message mq was a request for a DEV_READ, DEV_WRITE, or DEV_IOCTL operation that was suspended, a reply is also guaranteed to succeed. In both of these scenarios, it is not necessary to queue the message in repl_queue and, therefore, can_enqueue is FALSE if called from sr_rec().
However, if sr_reply() is called from sr_get_userdata() or sr_put_userdata(), the message may fail due to deadlock. Also, the message must eventually reach the file system so that the user process may be unblocked. Therefore, the reply message must be placed in repl_queue if the message doesn't succeed due to deadlock. For this reason, can_enqueue is TRUE if called from sr_get_userdata() or sr_put_userdata().
It is important to understand why deadlocks can occur. During the initialization of the network service, the network service calls svrctl() (which calls do_svrctl()) to initialize itself with the FS. do_svrctl() sets the fields of the network service's dmap[] entry to gen_io() for read, write, and ioctl requests and clone_opcl() for open and close requests. For this reason, whenever a read, write, or ioctl request for a network file is received by the file system, gen_io() is called and whenever an open or close request is received by the file system, clone_opcl() is called (note that clone_opcl() calls gen_io()).
If gen_io() calls sendrec() to send a message to the network service and there is a deadlock because the network service also attempted to send a message to the file system (i.e., sendrec() returned ELOCKED), gen_io() will call receive() instead.
If the read, write, or ioctl request could not be immediately satisfied, the network service will later send() another REVIVE message after the read, write, or ioctl operation has completed (successfully or unsuccessfully). However, the file system will not necessarily handle any deadlock that occurs (since the main loop in the file system may possibly handle the message) and so, if a deadlock occurs, the network service must later attempt to send the REVIVE message that was not successful due to the deadlock. The network service does this by placing the message in the repl_queue. The messages in repl_queue will be sent again the next time that sr_repl_queue() is called.
0015455 {
0015456 int result, proc, ref,operation;
0015457 message reply, *mp;
0015458
0015459 proc= mq->mq_mess.PROC_NR;
0015460 ref= 0;
0015461 operation= mq->mq_mess.m_type;
operation will be one of the following: DEV_OPEN, DEV_CLOSE, DEV_READ, DEV_WRITE, DEV_IOCTL, or CANCEL.
0015462
0015463 if (can_enqueue)
If the message fails due to deadlock but should be retried later (i.e., should be placed in repl_queue), can_enqueue is TRUE. For an explanation of when can_enqueue is TRUE and when can_enqueue is FALSE, see the comment for line 15454.
0015464 mp= &mq->mq_mess;
0015465 else
0015466 mp= &reply;
0015467
0015468 mp->m_type= REVIVE;
A message of type REVIVE 'revives' a previously blocked process. Other than cancel messages, REVIVE is the only type of message that the file system expects from the network service.
REVIVE is #define'd in include/minix/callnr.h:
#define REVIVE 67 /* to FS: revive a sleeping process */
0015469 mp->REP_PROC_NR= proc;
0015470 mp->REP_STATUS= status;
status (the second parameter to sr_reply()) can have many different values. For example, if a read/write/ioctl operation was suspended, status will equal SUSPEND (see line 15156). If an open operation was requested and the operation was successful, status will be the resulting file descriptor (for example, a udp file descriptor).
0015471 result= send(mq->mq_mess.m_source, mp);
0015472 if (result == ELOCKED && can_enqueue)
It is possible that the file system is also attempting to send the network server a message. This will result in a deadlock. For a detailed description of deadlocks and how they are handled, read the comment on line 15454.
0015473 {
0015474 if (repl_queue)
Place the message at the tail of the queue (which may also be the head of the queue).
repl_queue / repl_queue_tail
repl_queue is the head of the linked list (repl_queue_tail is the tail) of REVIVE messages that had been previously (unsuccessfully) sent to the file system (FS) but had not been delivered due to deadlock. Before any new messages are processed by the network service, a second (and last) attempt is made by sr_repl_queue() to send out these REVIVE messages in repl_queue.
0015475 repl_queue_tail->mq_next= mq;
0015476 else
0015477 repl_queue= mq;
0015478 repl_queue_tail= mq;
In addition to adding a message to repl_queue, the message's mq_next field should be set to NULL (since it is the tail of the queue). Future versions of the network service will fix this.
0015479 return;
0015480 }
0015481 if (result != OK)
0015482 ip_panic(("unable to send"));
0015483 if (can_enqueue)
0015484 mq_free(mq);
mq_free()
mq_free(mq) places mq, mq_free()'s only parameter, back on mq_freelist.
0015485 }
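The retry mechanism of sr_reply() can be sketched as a send with a fallback queue. Everything below is a toy model with hypothetical names: toy_send() stands in for send(), the value chosen for TOY_ELOCKED is arbitrary, and toy_flush_replies() only illustrates the idea of resending queued replies (it is not the MINIX sr_repl_queue()). The sketch also sets the next pointer to NULL when queueing, which is the fix suggested in the note on line 15478.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_OK      0
#define TOY_ELOCKED (-101)   /* arbitrary stand-in value for ELOCKED */

struct toy_mq { int status; struct toy_mq *next; };

struct toy_mq *toy_repl_head, *toy_repl_tail;
int toy_peer_busy;   /* nonzero: pretend the FS is itself sending to us */
int toy_sent;        /* number of replies delivered so far */

/* Hypothetical send primitive: fails while the peer is busy. */
int toy_send(const struct toy_mq *mq)
{
    (void)mq;
    if (toy_peer_busy)
        return TOY_ELOCKED;
    toy_sent++;
    return TOY_OK;
}

/* Try to deliver a reply; on deadlock, keep it for a later retry. */
void toy_reply(struct toy_mq *mq, int status)
{
    mq->status = status;
    if (toy_send(mq) == TOY_ELOCKED) {
        mq->next = NULL;                 /* new tail of the retry queue */
        if (toy_repl_head)
            toy_repl_tail->next = mq;
        else
            toy_repl_head = mq;
        toy_repl_tail = mq;
    }
}

/* Resend queued replies in order until one fails or none are left. */
void toy_flush_replies(void)
{
    while (toy_repl_head && toy_send(toy_repl_head) == TOY_OK)
        toy_repl_head = toy_repl_head->next;
    if (!toy_repl_head)
        toy_repl_tail = NULL;
}
```

Replies that collide with a busy peer are kept in arrival order, so the file system eventually sees them in the same order in which the operations completed.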
0015486
0015487 PRIVATE acc_t *sr_get_userdata (fd, offset, count, for_ioctl)
0015488 int fd;
0015489 vir_bytes offset;
0015490 vir_bytes count;
0015491 int for_ioctl;
sr_get_userdata()
sr_get_userdata() is the counterpart to sr_put_userdata() and does one of two things:
1) Copies data from a user process to a buffer (to be more specific, a chain of accessors) within the network service (this process). This can be either ioctl data (in which case, for_ioctl is TRUE) or write data (for_ioctl is FALSE). For example, udp_setopt() (indirectly) calls sr_get_userdata() to get configuration data. Also, restart_write_fd() (indirectly) calls sr_get_userdata() before passing data onto the ip code.
2) Sends a REVIVE message to the file system (FS). For example, if an illegal option is selected while configuring a udp file descriptor, reply_thr_get() is called, which then (indirectly) calls sr_get_userdata(), passing in EBADMODE for the parameter offset with count set to zero (the reply status is taken from offset on line 15527). restart_write_fd() also (indirectly) calls sr_get_userdata() to send a REVIVE message back to the FS indicating the number of bytes copied from the user process.
sr_get_userdata() is often called twice in close succession: the first time to attempt to copy the data from the user process, and the second time to send a message to the FS indicating whether the copy operation was successful and, if it was successful, the number of bytes copied.
In my opinion, like sr_put_userdata(), this function should have been made into two functions. As it is, it is too confusing.
0015492 {
0015493 sr_fd_t *loc_fd;
0015494 mq_t **head_ptr, **tail_ptr, *m, *tail, *mq;
0015495 int ip_flag, susp_flag;
0015496 int result;
0015497 int suspended;
0015498 char *src;
0015499 acc_t *acc;
0015500
0015501 loc_fd= &sr_fd_table[fd];
0015502
sr_get_userdata() is called for ioctl and write requests. The following if/else statement chooses the appropriate message queues and flags for the requested operation.
It is very important to note that the messages in the srf_ioctl_q and srf_write_q queues were placed there by sr_rwio() on lines 15309-15320.
0015503 if (for_ioctl)
0015504 {
0015505 head_ptr= &loc_fd->srf_ioctl_q;
0015506 tail_ptr= &loc_fd->srf_ioctl_q_tail;
0015507 ip_flag= SFF_IOCTL_IP;
0015508 susp_flag= SFF_IOCTL_SUSP;
0015509 }
0015510 else
0015511 {
0015512 head_ptr= &loc_fd->srf_write_q;
0015513 tail_ptr= &loc_fd->srf_write_q_tail;
0015514 ip_flag= SFF_WRITE_IP;
0015515 susp_flag= SFF_WRITE_SUSP;
0015516 }
0015517
0015518 assert (loc_fd->srf_flags & ip_flag);
0015519
0015520 if (!count)
If count (the third parameter of sr_get_userdata()) is zero, sr_get_userdata() sends a message to the file system (by calling sr_reply()) and then processes the messages that couldn't be delivered previously (by calling process_req_q()).
The variable m is the first message in the queue and mq is the second message in the queue. Note that it is more than likely that the network service has serviced the request of message m. For example, if message m was a request for a write operation, a message was sent (by cp_u2b()) to SYSTASK requesting a data transfer from a user process. (Note that, although unlikely, the call to cp_u2b() may not have been successful.) Only a message to the FS indicating whether the operation was successful must now be sent. This is not true for the rest of the messages in the queue, which still must be processed.
0015521 {
0015522 m= *head_ptr;
0015523 *head_ptr= NULL;
0015524 tail= *tail_ptr;
0015525 assert(m);
0015526 mq= m->mq_next;
Note that mq is the second message in the queue. m is the first message and an attempt has already been made to process the message. sr_reply() will send a message indicating whether the attempt was successful.
0015527 result= (int)offset;
0015528 sr_reply (m, result, 1);
0015529 suspended= (loc_fd->srf_flags & susp_flag);
0015530 loc_fd->srf_flags &= ~(ip_flag|susp_flag);
0015531 if (suspended)
0015532 {
0015533 process_req_q(mq, tail, tail_ptr);
process_req_q()
If a message arrives but the message's request cannot be satisfied (for example, because there was no data available to be read), the message is put into a queue (e.g., srf_read_q). When the request can finally be processed, the first message in the queue is processed and then process_req_q() is called to process all of the remaining messages in the queue. process_req_q() does this by calling sr_rwio() repeatedly until either there are no remaining messages in the queue or sr_rwio() returns SUSPEND. (For example, sr_rwio() will return SUSPEND if there is not enough data to satisfy a read request.)
process_req_q() has three parameters:
mq - the first message in a linked list to be processed
tail - the address of the last message in the linked list
tail_ptr - the address of the relevant field of the file descriptor. For example, the relevant field for a read request will be the srf_read_q_tail field.
To see process_req_q() in action, consider the following example where messages requesting reads arrive yet there is no data waiting to be read:
Message 1 arrives, requesting a read operation. sr_rec() calls sr_rwio(), which checks whether the read flag was previously set. The read flag is not set, indicating that there has not been a previously unfulfilled read request. Therefore, the message is placed in the read queue (it becomes the head and the tail of the queue) and the protocol-specific read function is called (e.g., udp_read()). There is no data waiting to be read by the user process so the request is suspended (i.e., the suspend flag for the file descriptor is raised). Four further messages (messages 2-5) arrive and are placed at the end of the read queue. At this point, the queue is as follows:
Finally, data arrives. As a result, sr_put_userdata() is called twice. The first time sr_put_userdata() is called, the data requested by message 1 is copied to the user process. The second time sr_put_userdata() is called, sr_reply() is called, which sends a "REVIVE" message back to the FS specifying how many bytes were copied. Message 1 is then taken off the queue. At this point, the queue is as follows:
After sr_reply() returns, sr_put_userdata() calls process_req_q(), with a pointer to message 2 as the first argument, a pointer to message 5 as the second argument, and a pointer to the srf_read_q_tail field of the sr file descriptor as the third argument. process_req_q() repeatedly calls sr_rwio() until there are either no more messages or there is no more data available to satisfy the remaining requests. Let's assume that there is enough data to satisfy the first 2 messages. After satisfying the first two messages, the queue is as follows:
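The draining behaviour described in this example can be sketched as a loop over the remaining requests. This is a simplified model with hypothetical names (toy_req, toy_drain()): each request consumes one waiting packet, whereas the real process_req_q() calls sr_rwio() and stops when it returns SUSPEND.

```c
#include <assert.h>
#include <stddef.h>

struct toy_req { int id; struct toy_req *next; };

/* Serve queued requests in order while packets remain.  Returns the
 * first request that could not be served (it stays at the head of the
 * queue), or NULL if every request was satisfied. */
struct toy_req *toy_drain(struct toy_req *mq, int *packets, int *served)
{
    while (mq != NULL && *packets > 0) {
        (*packets)--;        /* hand one packet to this request */
        (*served)++;
        mq = mq->next;
    }
    return mq;
}
```

With messages 2 to 5 queued and two packets waiting, the loop serves messages 2 and 3 and returns message 4, which matches the state of the queue at the end of the example above.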
0015534 }
0015535 else
0015536 {
0015537 assert(!mq);
0015538 }
0015539 return NULL;
0015540 }
0015541
0015542 src= (*head_ptr)->mq_mess.ADDRESS + offset;
0015543 result= cp_u2b ((*head_ptr)->mq_mess.PROC_NR, src, &acc, count);
cp_u2b()
cp_u2b() copies size (the fourth parameter) bytes from an address in a user process (src, the second parameter) to a linked list of buffers within the network service. The address of the first buffer in the linked list is returned through the third parameter. The process number of the user process is the first parameter and is a field in the message requesting the copy.
cp_u2b() is called only by sr_get_userdata().
"u2b" stands for "user to buffer".
0015544
0015545 return result<0 ? NULL : acc;
0015546 }
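The two roles of sr_get_userdata() can be made concrete with a small model in which a count of zero switches the function from copying to replying. The names below (toy_chan, toy_get_userdata()) are hypothetical, the user process is reduced to an in-memory string, and the reply is recorded in a struct field instead of being sent to the FS.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_chan {
    const char *user_buf;   /* stands in for the user process's buffer */
    size_t user_len;
    int replied;            /* set once a reply has been "sent" */
    int reply_status;
};

/* count != 0: copy count bytes starting at offset into out and return
 *             out, or NULL if the range lies outside the user buffer.
 * count == 0: treat offset as the status to report, record the reply,
 *             and return NULL. */
char *toy_get_userdata(struct toy_chan *c, size_t offset, size_t count,
                       char *out)
{
    if (count == 0) {
        c->replied = 1;
        c->reply_status = (int)offset;
        return NULL;
    }
    if (offset > c->user_len || count > c->user_len - offset)
        return NULL;
    memcpy(out, c->user_buf + offset, count);
    return out;
}
```

A caller typically makes two calls in succession: one with a nonzero count to fetch the data, and one with a count of zero whose offset argument carries the number of bytes handled (or an error code).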
0015547
0015548 PRIVATE int sr_put_userdata (fd, offset, data, for_ioctl)
0015549 int fd;
0015550 vir_bytes offset;
0015551 acc_t *data;
0015552 int for_ioctl;
sr_put_userdata()
sr_put_userdata(fd, offset, data, for_ioctl) is the counterpart to sr_get_userdata() and (like sr_get_userdata()) does one of two things:
1) Copies data from a buffer (to be more specific, a chain of accessors) within the network service (this process) to a buffer within the user process. This can be either ioctl data (in which case, for_ioctl is TRUE) or read data (for_ioctl is FALSE). For example, udp_ioctl() (indirectly) calls sr_put_userdata() to give configuration data to a user process. Also, udp_packet2user() (indirectly) calls sr_put_userdata() to pass data to the user process.
2) Sends a message to the FS. For example, if a read is attempted on a udp file descriptor before the file descriptor is configured, reply_thr_put() is called, which then (indirectly) calls sr_put_userdata(), passing in EBADMODE for the parameter offset and NULL for the parameter data.
In my opinion, like sr_get_userdata(), this should have been made into two functions. As it is, it is too confusing.
0015553 {
0015554 sr_fd_t *loc_fd;
0015555 mq_t **head_ptr, **tail_ptr, *m, *tail, *mq;
0015556 int ip_flag, susp_flag;
0015557 int result;
0015558 int suspended;
0015559 char *dst;
0015560
0015561 loc_fd= &sr_fd_table[fd];
sr_fd_table[] is an array that contains the sr file descriptors. A complete description of sr_fd_table[] follows the end of this listing.
0015562
0015563 if (for_ioctl)
sr_put_userdata() is called for two different requests: ioctl and read. The following if/else statement chooses the appropriate queues and flags for the requested operation.
The messages in the srf_ioctl_q and srf_read_q queues were placed there by sr_rwio() on lines 15309-15320.
0015564 {
0015565 head_ptr= &loc_fd->srf_ioctl_q;
0015566 tail_ptr= &loc_fd->srf_ioctl_q_tail;
0015567 ip_flag= SFF_IOCTL_IP;
0015568 susp_flag= SFF_IOCTL_SUSP;
0015569 }
0015570 else
0015571 {
0015572 head_ptr= &loc_fd->srf_read_q;
0015573 tail_ptr= &loc_fd->srf_read_q_tail;
0015574 ip_flag= SFF_READ_IP;
0015575 susp_flag= SFF_READ_SUSP;
0015576 }
0015577
0015578 assert (loc_fd->srf_flags & ip_flag);
0015579
0015580 if (!data)
If data, sr_put_userdata()'s third parameter, is zero, sr_put_userdata() sends a message to the file system (by calling sr_reply()) and then processes the messages that couldn't be delivered previously (by calling process_req_q()).
The variable m is the first message in the queue and mq is the second message in the queue. Note that it is more than likely that the network service has already serviced the request of message m. For example, if message m was a request for a read operation, a message was sent (by cp_b2u()) to SYSTASK requesting that the data be copied to the user process. (Note that, although unlikely, the call to cp_b2u() may not have been successful.) All that remains for message m is to send a message to the FS indicating whether the operation was successful. This is not true for the rest of the messages in the queue, which still must be processed.
0015581 {
0015582 m= *head_ptr;
0015583 assert(m);
0015584
0015585 *head_ptr= NULL;
0015586 tail= *tail_ptr;
0015587 mq= m->mq_next;
Note that mq is the second message in the queue. m is the first message and will be handled by sr_reply() below.
0015588 result= (int)offset;
0015589 sr_reply (m, result, 1);
sr_reply()
sr_reply(mq, status, can_enqueue) attempts to send a REVIVE message back to the file system (FS) (which is what the FS expects to receive). sr_reply() is called from sr_rec() and from sr_get_userdata() and sr_put_userdata().
If called from sr_rec(), there are two scenarios. If the message mq (sr_reply()'s first parameter) was a request for a DEV_OPEN, DEV_CLOSE, or CANCEL, a reply is guaranteed to succeed and therefore there is no need to enqueue the message in repl_queue. If the message mq was a request for a DEV_READ, DEV_WRITE, or DEV_IOCTL operation that was suspended, a reply is also guaranteed to succeed. In both of these scenarios, it is not necessary to queue the message in repl_queue and, therefore, can_enqueue is FALSE if called from sr_rec().
However, if sr_reply() is called from sr_get_userdata() or sr_put_userdata(), the message may fail due to deadlock. Also, the message must eventually reach the file system so that the user process may be unblocked. Therefore, the reply message must be placed in repl_queue if the message doesn't succeed due to deadlock. For this reason, can_enqueue is TRUE if called from sr_get_userdata() or sr_put_userdata().
It is important to understand why deadlocks can occur. During the initialization of the network service, the network service calls svrctl() (which calls do_svrctl()) to initialize itself with the FS. do_svrctl() sets the fields of the network service's dmap[] entry to gen_io() for read, write, and ioctl requests and clone_opcl() for open and close requests. For this reason, whenever a read, write, or ioctl request for a network file is received by the file system, gen_io() is called and whenever an open or close request is received by the file system, clone_opcl() is called (note that clone_opcl() calls gen_io()).
If gen_io() calls sendrec() to send a message to the network service and there is a deadlock because the network service also attempted to send a message to the file system (i.e., sendrec() returned ELOCKED), gen_io() will call receive() instead.
If the read, write, or ioctl request could not be immediately satisfied, the network service will later send() another REVIVE message after the read, write, or ioctl operation has completed (successfully or unsuccessfully). However, the file system will not necessarily handle any deadlock that occurs (since the main loop in the file system may possibly handle the message) and so, if a deadlock occurs, the network service must later retry the REVIVE message that was not delivered. The network service does this by placing the message in repl_queue. The messages in repl_queue will be sent again the next time that sr_repl_queue() is called.
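The enqueue-on-deadlock rule described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the actual body of sr_reply(): mock_send(), fs_busy, reply_or_enqueue(), the trimmed mq_t, and the numeric value of ELOCKED are all hypothetical stand-ins, and the sketch returns -1 where the real code would presumably panic.

```c
#include <assert.h>
#include <stddef.h>

#define OK        0
#define ELOCKED   (-101)   /* illustrative value, not taken from the MINIX headers */

/* Trimmed, hypothetical stand-in for mq_t. */
typedef struct mq {
    int        dest;       /* stand-in for mq_mess.m_source */
    int        status;     /* stand-in for the status carried by the REVIVE */
    struct mq *mq_next;
} mq_t;

static mq_t *repl_queue, *repl_queue_tail;

/* Hypothetical stand-in for send(): reports a deadlock while fs_busy is set. */
static int fs_busy;
static int mock_send(const mq_t *m) { (void)m; return fs_busy ? ELOCKED : OK; }

/* Returns 1 if the reply was delivered, 0 if it was parked on repl_queue,
 * and -1 on any other failure. */
static int reply_or_enqueue(mq_t *m, int status, int can_enqueue)
{
    int result;

    assert(m != NULL);
    m->status = status;
    result = mock_send(m);
    if (result == ELOCKED && can_enqueue) {
        /* Append to the tail so that replies are retried in order. */
        m->mq_next = NULL;
        if (repl_queue)
            repl_queue_tail->mq_next = m;
        else
            repl_queue = m;
        repl_queue_tail = m;
        return 0;
    }
    return result == OK ? 1 : -1;
}
```

With can_enqueue set to 0 (the sr_rec() case) the sketch never touches repl_queue, which matches the claim above that those replies do not need to be queued.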
0015590 suspended= (loc_fd->srf_flags & susp_flag);
0015591 loc_fd->srf_flags &= ~(ip_flag|susp_flag);
0015592 if (suspended)
0015593 {
0015594 process_req_q(mq, tail, tail_ptr);
process_req_q()
If a message arrives but the message's request cannot be satisfied (for example, because there was no data available to be read), the message is put into a queue (e.g., srf_read_q). When the request can finally be processed, the first message in the queue is processed and then process_req_q() is called to process all of the remaining messages in the queue. process_req_q() does this by calling sr_rwio() repeatedly until either there are no remaining messages in the queue or sr_rwio() returns SUSPEND. (For example, sr_rwio() will return SUSPEND if there is not enough data to satisfy a read request.)
process_req_q() has three parameters:
mq - the first message in a linked list to be processed
tail - the address of the last message in the linked list
tail_ptr - the address of the relevant field of the file descriptor. For example, the relevant field for a read request will be the srf_read_q_tail field.
To see process_req_q() in action, consider the following example where messages requesting reads arrive yet there is no data waiting to be read:
Message 1 arrives, requesting a read operation. sr_rec() calls sr_rwio(), which checks whether the read flag was previously set. The read flag is not set, indicating that there has not been a previously unfulfilled read request. Therefore, the message is placed in the read queue (it becomes the head and the tail of the queue) and the protocol-specific read function is called (e.g., udp_read()). There is no data waiting to be read by the user process so the request is suspended (i.e., the suspend flag for the file descriptor is raised). Four further messages (messages 2-5) arrive and are placed at the end of the read queue. At this point, srf_read_q points to message 1, srf_read_q_tail points to message 5, and the messages are linked in order of arrival through their mq_next fields.
Finally, data arrives. As a result, sr_put_userdata() is called twice. The first time sr_put_userdata() is called, the data requested by message 1 is copied to the user process. The second time sr_put_userdata() is called, sr_reply() is called, which sends a "REVIVE" message back to the FS specifying how many bytes were copied. Message 1 is then taken off the queue. At this point, srf_read_q is NULL and messages 2-5 remain linked together, with mq pointing to message 2 and tail pointing to message 5.
After sr_reply() returns, sr_put_userdata() calls process_req_q(), with a pointer to message 2 as the first argument, a pointer to message 5 as the second argument, and a pointer to the srf_read_q_tail field of the sr file descriptor as the third argument. process_req_q() repeatedly calls sr_rwio() until there are either no more messages or there is no more data available to satisfy the remaining requests. Let's assume that there is enough data to satisfy only the first two of the remaining messages (messages 2 and 3). After these two messages are satisfied, message 4 is suspended: srf_read_q points to message 4, message 4's mq_next field points to message 5, and srf_read_q_tail points to message 5.
0015595 }
0015596 else
0015597 {
0015598 assert(!mq);
0015599 }
0015600 return OK;
0015601 }
0015602
0015603 dst= (*head_ptr)->mq_mess.ADDRESS + offset;
0015604 return cp_b2u (data, (*head_ptr)->mq_mess.PROC_NR, dst);
cp_b2u()
cp_b2u(acc_ptr, proc, dest) copies all of the data in the accessor linked list acc_ptr, cp_b2u()'s first parameter, to a location dest, cp_b2u()'s third parameter, in the user process proc, cp_b2u()'s second parameter.
cp_b2u() is called only by sr_put_userdata().
"b2u" stands for "buffer to user".
0015605 }
0015606
0015607 PRIVATE void process_req_q(mq, tail, tail_ptr)
0015608 mq_t *mq, *tail, **tail_ptr;
process_req_q()
If a message arrives but the message's request cannot be satisfied (for example, because there was no data available to be read), the message is put into a queue (e.g., srf_read_q). When the request can finally be processed, the first message in the queue is processed and then process_req_q() is called to process all of the remaining messages in the queue. process_req_q() does this by calling sr_rwio() repeatedly until either there are no remaining messages in the queue or sr_rwio() returns SUSPEND. (For example, sr_rwio() will return SUSPEND if there is not enough data to satisfy a read request.)
process_req_q() has three parameters:
mq - the first message in a linked list to be processed
tail - the address of the last message in the linked list
tail_ptr - the address of the relevant field of the file descriptor. For example, the relevant field for a read request will be the srf_read_q_tail field.
0015609 {
0015610 mq_t *m;
0015611 int result;
0015612
0015613 for(;mq;)
Process the linked list of messages by calling sr_rwio() for each message. Quit after all of the messages have been processed by sr_rwio() or when sr_rwio() returns SUSPEND (sr_rwio() will return SUSPEND, for example, if there is not enough data to satisfy a read request).
0015614 {
0015615 m= mq;
0015616 mq= mq->mq_next;
0015617
0015618 DBLOCK(1, printf("calling rwio\n"));
0015619
0015620 result= sr_rwio(m);
sr_rwio()
sr_rwio() is either called by sr_rec() upon receipt of a read, write, or ioctl request or by process_req_q() to handle queued messages. The handling of a read, write, or ioctl request is a complex operation that is best explained by an example. We will analyze two different read requests, one with data ready to be delivered to user processes and one with no data ready to be delivered.
SCENARIO 1: There is data ready to be delivered to the user process and there are no messages in the read queue (i.e., srf_read_q, srf_read_q_tail == null). The file descriptor was opened by the udp client.
When a message arrives requesting a read operation (i.e., m->mq_mess.m_type == DEV_READ), udp_read() is called, which in turn calls udp_packet2user(). udp_packet2user() calls sr_put_userdata() twice. The first time sr_put_userdata() is called, the data is transferred to the user process. The second time sr_put_userdata() is called, a reply is sent to the file system by sr_reply(), which ultimately calls mq_free(), freeing the message.
SCENARIO 2: There are no read messages waiting in the read queue of a file descriptor opened by the udp client. However, this time, there is no data waiting to be read.
A DEV_READ message (message 1) arrives and sr_rec() calls sr_rwio() to handle the message. Since there are no messages in the read queue, SFF_READ_IP is not yet raised. After setting the SFF_READ_IP flag, sr_rwio() calls udp_read(). However, since no data is waiting to be read, udp_read() returns NW_SUSPEND. sr_rwio() then sets the SFF_READ_SUSP flag and returns NW_SUSPEND to sr_rec(), which then sends a REVIVE message back to the file system (in other words, it instructs the FS to revive the process requesting the read). After this, srf_read_q and srf_read_q_tail of the sr file descriptor both point to message 1, and both the SFF_READ_IP and SFF_READ_SUSP flags are set.
A second DEV_READ message (message 2) arrives and sr_rec() again calls sr_rwio(). Because the SFF_READ_IP flag is already raised by the first message, the second message is appended to the read queue and sr_rwio() returns NW_SUSPEND. Again, sr_reply() sends a REVIVE message to the file system.
A third DEV_READ message (message 3) arrives and, as before, is placed at the end of the read queue and a REVIVE message is sent to the file system.
Finally, some data arrives (possibly from the ethernet task). udp_packet2user() is ultimately called and it, in turn, calls sr_put_userdata() twice: the first time to deliver the packet to the buffer of the user process and the second time to report the number of bytes delivered. The second time sr_put_userdata() is called, it sets the read queue head to null and calls sr_reply().
After sr_reply() returns, sr_put_userdata() then calls process_req_q(), passing in mq (message 2) as the first parameter. process_req_q() goes through the remaining messages, starting with mq, handling the read requests (by calling sr_rwio()) if possible. If there is enough data to satisfy all of the read requests (i.e., the requests of messages 2 and 3), the read queue is left empty. If, for example, there is not even enough data to satisfy the request of message 2, message 2 becomes the head of the read queue again, followed by message 3.
If there was enough data to satisfy the requests of messages 2 and 3, these messages are freed by sr_reply(), just as message 1 was previously freed. Note that, in the case where message 2 could not be satisfied, the srf_read_q field was set by sr_rwio() before returning.
0015621 if (result == SUSPEND)
sr_rwio() returns SUSPEND if the request couldn't be satisfied. For example, if there was no data to be read and the message was a read request, sr_rwio() would return SUSPEND.
0015622 {
0015623 if (mq)
0015624 {
0015625 (*tail_ptr)->mq_next= mq;
0015626 *tail_ptr= tail;
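Relink the messages that have not yet been processed behind the suspended message and restore the tail. When sr_rwio() suspended message m, it made m both the head and the tail of the queue, so the rest of the original list (starting at mq and ending at tail) must be reattached here.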
0015627 }
0015628 return;
0015629 }
0015630 }
0015631 return;
0015632 }
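The relinking performed by process_req_q() is easy to misread, so here is a small stand-alone sketch of it. It assumes a hypothetical mock_rwio() that satisfies a request while enough bytes are available and otherwise queues the message as the sole element of the read queue, which is how the commentary describes sr_rwio(). The names read_q, read_q_tail, avail, and want, and the value of SUSPEND, are inventions for the sketch; ANSI prototypes are used rather than the K&R style of the listing.

```c
#include <assert.h>
#include <stddef.h>

#define OK        0
#define SUSPEND   (-998)   /* illustrative value only */

/* Trimmed, hypothetical stand-in for mq_t: each message wants `want` bytes. */
typedef struct mq {
    int        want;
    struct mq *mq_next;
} mq_t;

static mq_t *read_q, *read_q_tail;   /* stand-ins for srf_read_q / srf_read_q_tail */
static int avail;                    /* bytes currently available to readers */

/* Hypothetical stand-in for sr_rwio(). */
static int mock_rwio(mq_t *m)
{
    assert(m != NULL);
    if (m->want <= avail) {
        avail -= m->want;
        return OK;
    }
    m->mq_next = NULL;
    read_q = read_q_tail = m;        /* m becomes head and tail */
    return SUSPEND;
}

/* Same control flow as lines 15613-15631. */
static void process_req_q_sketch(mq_t *mq, mq_t *tail, mq_t **tail_ptr)
{
    mq_t *m;

    while (mq) {
        m = mq;
        mq = mq->mq_next;
        if (mock_rwio(m) == SUSPEND) {
            if (mq) {
                (*tail_ptr)->mq_next = mq;   /* hang the rest behind m */
                *tail_ptr = tail;            /* restore the original tail */
            }
            return;
        }
    }
}
```

Running the sketch on messages 2-5 with enough data for two of them leaves message 4 at the head of the queue with message 5 behind it, which is the end state of the worked example in the commentary.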
0015633
0015634 PRIVATE int cp_u2b (proc, src, var_acc_ptr, size)
0015635 int proc;
0015636 char *src;
0015637 acc_t **var_acc_ptr;
0015638 int size;
cp_u2b()
cp_u2b() copies size (the fourth parameter) bytes from an address in a user process (src, the second parameter) to a linked list of buffers within the network service. The address of the first buffer in the linked list is returned through the third parameter. The process number of the user process is the first parameter and is a field in the message requesting the copy.
cp_u2b() is called only by sr_get_userdata().
"u2b" stands for "user to buffer".
0015639 {
0015640 static message mess;
0015641 acc_t *acc;
0015642 int i;
0015643
0015644 acc= bf_memreq(size);
bf_memreq()
bf_memreq() allocates accessors to the caller. For example, if 1514 bytes of buffer space are requested immediately after the network process starts and each buffer is 512 bytes (the default), three elements of accessors[] are removed from buf512_freelist and the head of the chain of the three accessors is returned by bf_memreq(). The acc_linkC and buf_linkC fields are set to one, and acc_length and acc_offset are set to their appropriate values.
So what happens if there are not enough buffers on the buf512_freelist to satisfy a request? On lines 2280-2290 of buf.c, functions that free buffers for the specific clients (e.g., eth_buffree()) are called until there are enough buffers on buf512_freelist.
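The arithmetic behind the 1514-byte example can be checked with a short sketch. The helper names bufs_needed() and buf_length() are hypothetical, and the per-buffer lengths are my assumption about what the "appropriate values" of acc_length are (full buffers followed by one partial buffer); the sketch does not reproduce bf_memreq() itself.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_SIZE 512   /* default buffer size named in the text */

/* Number of BUF_SIZE-byte buffers needed to hold `size` bytes. */
static size_t bufs_needed(size_t size)
{
    return (size + BUF_SIZE - 1) / BUF_SIZE;
}

/* Bytes held by buffer number `idx` (counting from 0) in a chain that
 * stores `size` bytes in total; 0 if the chain is shorter than that. */
static size_t buf_length(size_t size, size_t idx)
{
    size_t start = idx * BUF_SIZE;

    assert(BUF_SIZE > 0);
    if (start >= size)
        return 0;
    return (size - start < BUF_SIZE) ? size - start : BUF_SIZE;
}
```

For a request of 1514 bytes this gives three buffers holding 512, 512, and 490 bytes.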
0015645
0015646 *var_acc_ptr= acc;
One means of passing back a value to the caller of a function is through a reference. In this way, cp_u2b() passes back the address of the linked list of allocated buffers containing the data copied from the user process (see line 15543).
0015647 i=0;
0015648
0015649 while (acc)
This while loop sets up the SYS_VCOPY (V stands for vector) message on line 15668. Before the message is sent, a vector (cpvec) of all the addresses and sizes of the buffers must be built. The data within the user process is copied to the addresses contained within this vector.
0015650 {
0015651 size= (vir_bytes)acc->acc_length;
0015652
cpvec[] is defined in this file:
PRIVATE cpvec_t cpvec[CPVEC_NR];
and cpvec_t is declared in include/minix/type.h:
typedef struct {
vir_bytes cpv_src; /* src address of data */
vir_bytes cpv_dst; /* dst address of data */
vir_bytes cpv_size; /* size of data */
} cpvec_t;
0015653 cpvec[i].cpv_src= (vir_bytes)src;
0015654 cpvec[i].cpv_dst= (vir_bytes)ptr2acc_data(acc);
ptr2acc_data()
The macro ptr2acc_data is #define'd in inet/generic/buf.h as:
#define ptr2acc_data(/* acc_t * */ a) (bf_temporary_acc=(a), \
(&bf_temporary_acc->acc_buffer->buf_data_p[bf_temporary_acc-> \
acc_offset]))
ptr2acc_data() simply returns a pointer to the actual data within an accessor.
ptr2acc_data() is usually called so that the fields of a header (e.g., ip header) can be analyzed.
0015655 cpvec[i].cpv_size= size;
0015656
0015657 src += size;
0015658 acc= acc->acc_next;
0015659 i++;
0015660
0015661 if (i == CPVEC_NR || acc == NULL)
For a SYS_VCOPY request, the vector can be a maximum of CPVEC_NR (16) elements. If the size of the vector is less than or equal to CPVEC_NR elements, all of the data from the user process can be copied with a single message. If not, multiple messages to SYSTASK must be sent.
0015662 {
0015663 mess.m_type= SYS_VCOPY;
0015664 mess.m1_i1= proc;
0015665 mess.m1_i2= this_proc;
0015666 mess.m1_i3= i;
0015667 mess.m1_p1= (char *)cpvec;
0015668 if (sendrec(SYSTASK, &mess) <0)
0015669 ip_panic(("unable to sendrec"));
Send the message to SYSTASK (see OSDI, Section 3.10) requesting the copy and wait for it to finish.
The message will have the following format:
m_source | m_type (SYS_VCOPY) | src process (m1_i1) | dst process (m1_i2) | vec size (m1_i3) | vec addr (char *m1_p1) | - | -
0015670 if (mess.m_type <0)
0015671 {
0015672 bf_afree(*var_acc_ptr);
bf_afree()
After a chain of accessors is no longer needed, the chain (and not simply the single accessor passed as the parameter) can be freed by calling bf_afree(). However, if either acc_linkC or buf_linkC of one of the accessors in the linked list is not equal to one (1), the chain will not be freed in its entirety. For example, suppose that bf_afree(acc1) is called when acc1 (accessors[63]) and a second accessor, acc2 (accessors[64]), both reference buffers512[127].
bf_afree() returns acc1 (accessors[63]) to acc_freelist (recall that acc_freelist is the linked list of acc_t's without an associated buffer). However, buffers512[127] cannot be freed because acc2 (accessors[64]) still references it.
bf_afree() is called after an accessor's associated data is no longer needed (for example, after a packet has been sent off by the ethernet driver).
0015673 *var_acc_ptr= 0;
0015674 return mess.m_type;
0015675 }
0015676 i= 0;
0015677 }
0015678 }
0015679 return OK;
0015680 }
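The batching of the copy vector in cp_u2b() can be tried out without a system task. The sketch below is a simplified model under stated assumptions: mock_vcopy() stands in for the SYS_VCOPY request and simply calls memcpy(), the cpvec_t and acc_t types are trimmed versions that use real pointers instead of vir_bytes, and copy_user_to_chain() and vcopy_calls are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CPVEC_NR 16

/* Simplified vector element: real pointers instead of vir_bytes. */
typedef struct {
    const char *cpv_src;
    char       *cpv_dst;
    size_t      cpv_size;
} cpvec_t;

/* Trimmed, hypothetical accessor: just a data pointer, a length, a link. */
typedef struct acc {
    char       *data;
    size_t      acc_length;
    struct acc *acc_next;
} acc_t;

static cpvec_t cpvec[CPVEC_NR];
static int vcopy_calls;              /* number of simulated SYS_VCOPY requests */

/* Hypothetical stand-in for the SYS_VCOPY request: copy n vector elements. */
static void mock_vcopy(const cpvec_t *vec, int n)
{
    int k;

    for (k = 0; k < n; k++)
        memcpy(vec[k].cpv_dst, vec[k].cpv_src, vec[k].cpv_size);
    vcopy_calls++;
}

/* Same loop shape as lines 15649-15678: fill the vector, flush it when it
 * is full or when the chain ends. */
static void copy_user_to_chain(const char *src, acc_t *acc)
{
    int i = 0;

    while (acc) {
        assert(acc->data != NULL);
        cpvec[i].cpv_src = src;
        cpvec[i].cpv_dst = acc->data;
        cpvec[i].cpv_size = acc->acc_length;
        src += acc->acc_length;
        acc = acc->acc_next;
        i++;
        if (i == CPVEC_NR || acc == NULL) {
            mock_vcopy(cpvec, i);
            i = 0;
        }
    }
}
```

A chain of 20 accessors therefore needs two requests: one carrying 16 vector elements and one carrying the remaining 4.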
0015681
0015682 PRIVATE int cp_b2u (acc_ptr, proc, dest)
0015683 acc_t *acc_ptr;
0015684 int proc;
0015685 char *dest;
cp_b2u()
cp_b2u(acc_ptr, proc, dest) copies all of the data in the accessor linked list acc_ptr, cp_b2u()'s first parameter, to a location dest, cp_b2u()'s third parameter, in the user process proc, cp_b2u()'s second parameter.
cp_b2u() is called only by sr_put_userdata().
"b2u" stands for "buffer to user".
0015686 {
0015687 static message mess;
0015688 acc_t *acc;
0015689 int i, size;
0015690
0015691 acc= acc_ptr;
0015692 i=0;
0015693
0015694 while (acc)
This while loop sets up the SYS_VCOPY (V stands for vector) message on line 15716. Before the message is sent, a vector (cpvec) of all the addresses and sizes of the buffers must be built. The data within these buffers is copied to the user process.
0015695 {
0015696 size= (vir_bytes)acc->acc_length;
0015697
0015698 if (size)
0015699 {
0015700 cpvec[i].cpv_src= (vir_bytes)ptr2acc_data(acc);
0015701 cpvec[i].cpv_dst= (vir_bytes)dest;
0015702 cpvec[i].cpv_size= size;
0015703 i++;
0015704 }
0015705
0015706 dest += size;
0015707 acc= acc->acc_next;
0015708
0015709 if (i == CPVEC_NR || acc == NULL)
SYS_VCOPY is a message that a server (network service, file system, memory manager, etc.) sends to the system task to request a memory copy. Note that a user process cannot send this message. For a detailed description, see section 3.10 of Operating Systems, Design and Implementation.
For a SYS_VCOPY request, the vector can be a maximum of CPVEC_NR (16) elements. If the size of the vector is less than or equal to CPVEC_NR elements, all of the data can be copied to the user process with a single message. If not, multiple messages to SYSTASK must be sent.
0015710 {
0015711 mess.m_type= SYS_VCOPY;
0015712 mess.m1_i1= this_proc;
this_proc is a global variable that is equal to the process number of the network service. this_proc was set during the initialization of the network service.
0015713 mess.m1_i2= proc;
0015714 mess.m1_i3= i;
0015715 mess.m1_p1= (char *)cpvec;
0015716 if (sendrec(SYSTASK, &mess) <0)
0015717 ip_panic(("unable to sendrec"));
0015718 if (mess.m_type <0)
0015719 {
0015720 bf_afree(acc_ptr);
0015721 return mess.m_type;
0015722 }
0015723 i= 0;
0015724 }
0015725 }
0015726 bf_afree(acc_ptr);
0015727 return OK;
0015728 }
0015729
0015730 PRIVATE int sr_repl_queue(proc, ref, operation)
0015731 int proc;
0015732 int ref;
0015733 int operation;
sr_repl_queue()
If repl_queue is not empty, sr_repl_queue(proc, ref, operation) is called to send all messages (which will be REVIVE messages) in repl_queue to the file system. If called with proc, sr_repl_queue()'s first parameter, set to ANY (#define'd in include/minix/com.h; this value is not within the allowable range of process numbers), sr_repl_queue() goes through repl_queue and sends each reply message to the appropriate process. If called with proc set to a specific process number (which is the case when sr_rec() is processing an NW_CANCEL message), sr_repl_queue() goes through repl_queue and sends every REVIVE message except the one destined for the process specified by proc (if one exists). sr_repl_queue() sends that message last, after all other messages have been sent: while the file system is still interested in this REVIVE message, priority goes to the other replies.
In addition to sending all the reply messages queued in repl_queue, sr_repl_queue() frees the messages in repl_queue for later use and sets repl_queue to NULL.
A return value of 1 indicates that a message in repl_queue was canceled. A return value of 0 indicates that no messages were canceled.
The second and third parameters (ref and operation) are never used in sr_repl_queue().
0015734 {
0015735 mq_t *m, *m_cancel, *m_tmp;
0015736 int result;
0015737
0015738 m_cancel= NULL;
0015739
0015740 for (m= repl_queue; m;)
Attempt to send each of the REVIVE messages in repl_queue to the file system.
repl_queue / repl_queue_tail
repl_queue is the head of the linked list (repl_queue_tail is the tail) of REVIVE messages that had been previously (unsuccessfully) sent to the file system (FS) but had not been delivered due to deadlock. Before any new messages are processed by the network service, a second (and last) attempt is made by sr_repl_queue() to send out these REVIVE messages in repl_queue.
The circumstances that lead to a deadlock are described in the comments for sr_reply().
0015741 {
0015742 if (m->mq_mess.REP_PROC_NR == proc)
If this REVIVE message is destined for the process that just requested an NW_CANCEL (see line 15128), set it aside so that it is sent last. This message is handled on lines 15759-15768.
0015743 {
0015744 assert(!m_cancel);
0015745 m_cancel= m;
0015746 m= m->mq_next;
0015747 continue;
0015748 }
0015749 assert(m->mq_mess.m_source != MM_PROC_NR);
0015750 assert(m->mq_mess.m_type == REVIVE);
0015751 result= send(m->mq_mess.m_source, &m->mq_mess);
Attempt to send the REVIVE message to the file system. This message will be followed by an NW_CANCEL message.
0015752 if (result != OK)
0015753 ip_panic(("unable to send: %d", result));
0015754 m_tmp= m;
0015755 m= m->mq_next;
0015756 mq_free(m_tmp);
mq_free()
mq_free(mq) places mq, mq_free()'s only parameter, back on mq_freelist.
0015757 }
0015758 repl_queue= NULL;
0015759 if (m_cancel)
If a process just sent an NW_CANCEL message and there is a REVIVE message in repl_queue destined for this process, send this REVIVE message last.
0015760 {
0015761 assert(m_cancel->mq_mess.m_source != MM_PROC_NR);
0015762 assert(m_cancel->mq_mess.m_type == REVIVE);
0015763 result= send(m_cancel->mq_mess.m_source, &m_cancel->mq_mess);
0015764 if (result != OK)
0015765 ip_panic(("unable to send: %d", result));
0015766 mq_free(m_cancel);
0015767 return 1;
0015768 }
0015769 return 0;
0015770 }
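The ordering rule in sr_repl_queue() (every queued reply first, the reply for the canceling process last) can be sketched as follows. This is a simplified model, not the real function: mock_send() only records the order in which replies would go out, the mq_t type is trimmed to one field, flush_repl_queue() is a hypothetical name, and, unlike the real code, the sketch neither frees messages nor asserts that at most one message matches proc.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SENT 8

/* Trimmed, hypothetical stand-in for mq_t. */
typedef struct mq {
    int        rep_proc;             /* stand-in for mq_mess.REP_PROC_NR */
    struct mq *mq_next;
} mq_t;

static mq_t *repl_queue;
static int sent[MAX_SENT];           /* rep_proc of each reply, in send order */
static int nsent;

/* Hypothetical stand-in for send(): just records the order of delivery. */
static void mock_send(const mq_t *m)
{
    assert(m != NULL);
    if (nsent < MAX_SENT)
        sent[nsent++] = m->rep_proc;
}

/* Returns 1 if a reply for `proc` was found (and sent last), 0 otherwise. */
static int flush_repl_queue(int proc)
{
    mq_t *m, *m_cancel = NULL;

    for (m = repl_queue; m; m = m->mq_next) {
        if (m->rep_proc == proc) {
            m_cancel = m;            /* hold this one back */
            continue;
        }
        mock_send(m);
    }
    repl_queue = NULL;
    if (m_cancel) {
        mock_send(m_cancel);
        return 1;
    }
    return 0;
}
```

Passing a value that matches no process (the role ANY plays in the real code) sends the replies in queue order and returns 0.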
0015771
0015772 /*
0015773 * $PchId: sr.c,v 1.9 1996/05/07 21:11:14 philip Exp $
0015774 */
All of the code in this file revolves around the array sr_fd_table[], which is described below.
sr_fd / sr_fd_table[] / sr_add_minor()
One of the most important data arrays in the network service is sr_fd_table[], an array of 64 struct sr_fd's. Each sr_fd element in sr_fd_table[] corresponds to either a device or an opened file descriptor to a device (i.e., a "channel").
For each device (e.g., /dev/udp0), an element in sr_fd_table[] is configured by sr_add_minor(). For example, for the following inet.conf file:
eth0 DP8390 0 { default; };
psip1;
an element (i.e., a struct sr_fd) is configured for each of the following devices:
/dev/eth0 sr_fd_table[1]
/dev/ip0 sr_fd_table[2]
/dev/tcp0 sr_fd_table[3]
/dev/udp0 sr_fd_table[4]
/dev/psip1 sr_fd_table[17]
/dev/ip1 sr_fd_table[18]
/dev/tcp1 sr_fd_table[19]
/dev/udp1 sr_fd_table[20]
sr_add_minor() is called in the initialization routines for the various protocols: mnx_eth.c (osdep_eth_init()), psip.c (psip_enable()), ip.c (ip_init()), tcp.c (tcp_init()), and udp.c (udp_init()).
When a device file (e.g., /dev/udp0) is opened by a process, the element that corresponds to the device is copied to an element that is currently unoccupied (see sr_open()). In this way, a "channel" is opened. Using this technique, a channel can be opened, closed, and manipulated without affecting the elements of the descriptors initially set by sr_add_minor().
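The copy-on-open technique described above can be sketched as follows. This is a minimal illustration, assuming a trimmed sr_fd_t with only three fields and a hypothetical open_channel() helper; the real sr_open() also calls the protocol-specific open function and stores its result in srf_fd, which the sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>

#define FD_NR      64
#define SFF_FREE   0x00
#define SFF_MINOR  0x01
#define SFF_INUSE  0x02

/* Trimmed, hypothetical version of struct sr_fd. */
typedef struct {
    int srf_flags;
    int srf_fd;
    int srf_port;
} sr_fd_t;

static sr_fd_t sr_fd_table[FD_NR];   /* zero-initialized: every slot is SFF_FREE */

/* Copies the device entry `minor` into the first unoccupied slot and returns
 * the index of that slot, or -1 if `minor` is not a configured device or the
 * table is full. The device entry itself is left untouched. */
static int open_channel(int minor)
{
    int i;

    assert(FD_NR > 0);
    if (minor < 0 || minor >= FD_NR)
        return -1;
    if (!(sr_fd_table[minor].srf_flags & SFF_MINOR))
        return -1;
    for (i = 0; i < FD_NR; i++) {
        if (!(sr_fd_table[i].srf_flags & SFF_INUSE)) {
            sr_fd_table[i] = sr_fd_table[minor];
            sr_fd_table[i].srf_flags = SFF_INUSE;   /* a channel, not a device */
            return i;
        }
    }
    return -1;
}
```

Because the channel is a copy, closing or reconfiguring it cannot disturb the entry that sr_add_minor() set up for the device.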
int srf_flags:
srf_flags is a combination of the following:
#define SFF_FREE 0x00
#define SFF_MINOR 0x01
#define SFF_INUSE 0x02
#define SFF_BUSY 0x3C
#define SFF_IOCTL_IP 0x04
#define SFF_READ_IP 0x08
#define SFF_WRITE_IP 0x10
#define SFF_PENDING_REQ 0x30
#define SFF_SUSPENDED 0x1C0
#define SFF_IOCTL_SUSP 0x40
#define SFF_READ_SUSP 0x80
#define SFF_WRITE_SUSP
srf_flags is initialized to SFF_FREE for each element in sr_fd_table[]. If the channel corresponds to a device file, srf_flags is set to SFF_INUSE | SFF_MINOR. If the channel does not correspond to a device file, srf_flags is set simply to SFF_INUSE.
When a request comes in for a read, write, or ioctl operation and the network service is not already processing another request for the same operation, the SFF_READ_IP, SFF_WRITE_IP, or SFF_IOCTL_IP flag is set in srf_flags. However, if an operation is attempted but the underlying protocol is still processing a previous request of the same nature (e.g., udp_write()), the appropriate flag (SFF_IOCTL_SUSP, SFF_READ_SUSP, or SFF_WRITE_SUSP) in srf_flags is set.
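As a small illustration of how these bits combine, the sketch below walks the read flags through one request. The helper names begin_read(), suspend_read(), and finish_read() are hypothetical; only the flag values come from the list above, and finish_read() mirrors lines 15590-15591 of sr_put_userdata().

```c
#include <assert.h>
#include <stddef.h>

#define SFF_INUSE      0x02
#define SFF_READ_IP    0x08
#define SFF_READ_SUSP  0x80

/* Returns 1 if a read is already in progress (the new request must be
 * queued behind it); otherwise marks a read as in progress and returns 0. */
static int begin_read(int *flags)
{
    assert(flags != NULL);
    if (*flags & SFF_READ_IP)
        return 1;
    *flags |= SFF_READ_IP;
    return 0;
}

/* Records that the read in progress could not be satisfied yet. */
static void suspend_read(int *flags)
{
    assert(flags != NULL);
    *flags |= SFF_READ_SUSP;
}

/* Clears both read flags and reports whether the read had been suspended. */
static int finish_read(int *flags)
{
    int suspended;

    assert(flags != NULL);
    suspended = *flags & SFF_READ_SUSP;
    *flags &= ~(SFF_READ_IP | SFF_READ_SUSP);
    return suspended != 0;
}
```

Starting from SFF_INUSE (0x02), a read that is begun and then suspended leaves srf_flags at 0x8A, and finishing it restores 0x02.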
int srf_fd, srf_port:
srf_fd and srf_port are both set by sr_add_minor(). For the channels in sr_fd_table[] that correspond to the device files (e.g., /dev/udp0), srf_fd is set to the minor device number of the device. For example, if /dev/udp0 is added to sr_fd_table[] and the interface number of the device file is 0 (see comments for ip_conf[]), then the minor device number is:
if2minor(ifno, dev) = (0)*16 + UDP_DEV = 0 + 4 = 4
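The same arithmetic reproduces the indices listed earlier for the example inet.conf. In the sketch below, the macro body is reconstructed from the calculation above rather than copied from the source, and the constant names other than UDP_DEV are assumed; their values are read off the device table (eth0 = 1, ip0 = 2, tcp0 = 3, udp0 = 4).

```c
#define ETH_DEV 1   /* assumed name; psip devices appear to use this offset too */
#define IP_DEV  2   /* assumed name */
#define TCP_DEV 3   /* assumed name */
#define UDP_DEV 4

/* Reconstructed from the arithmetic above: 16 minor numbers per interface. */
#define if2minor(ifno, dev) ((ifno) * 16 + (dev))
```

With these definitions, if2minor(0, UDP_DEV) evaluates to 4 and if2minor(1, UDP_DEV) to 20, matching the sr_fd_table[] indices given for /dev/udp0 and /dev/udp1.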
For the channels in sr_fd_table[] that do not correspond to a device file, srf_fd is the file descriptor for the appropriate protocol. For example, if the file system requests that a udp channel be opened, srf_open is dereferenced and udp_open() is called. udp_open() opens a udp file descriptor and returns the index of the corresponding element in udp_fd_table[]. srf_fd is set to this index.
Later, when the file system requests a read or a write on the open channel, srf_fd is passed into the protocol-specific read or write function (e.g., udp_read()), allowing the protocol-specific function to locate the appropriate file descriptor (e.g., udp file descriptor).
srf_port is more straightforward: it is the index in the protocol's port table. For example, if a system has two udp device files (/dev/udp0 and /dev/udp1), udp_port_table[] will have two entries, 0 and 1. Therefore, srf_port for the entry in sr_fd_table[] that corresponds to /dev/udp0 will be 0 and srf_port for the entry that corresponds to /dev/udp1 will be 1.
sr_open_t srf_open:
sr_close_t srf_close:
sr_write_t srf_write:
sr_read_t srf_read:
sr_ioctl_t srf_ioctl:
sr_cancel_t srf_cancel:
The fields above are all pointers to protocol-specific functions and are all set by sr_add_minor(). For example, when sr_add_minor() is called by udp_init(), srf_open, srf_close, srf_write, srf_read, srf_ioctl, and srf_cancel are set to point to the functions udp_open(), udp_close(), udp_write(), udp_read(), udp_ioctl(), and udp_cancel(). Later, when the file system makes a request to the network service, these functions will be called. For example, if the file system requests that data be written to a channel, srf_write is dereferenced and, if the channel is a udp channel, udp_write() is called.
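The dispatch through these function pointers can be illustrated with a reduced sketch. The signature write_fn_t is deliberately simplified and does not match the real sr_write_t, and fake_udp_write(), fake_tcp_write(), chan_sketch, and channel_write() are hypothetical names used only for this example.

```c
#include <stddef.h>

/* Simplified signature; the real sr_write_t takes different arguments. */
typedef int (*write_fn_t)(int fd, size_t count);

struct chan_sketch {
    int        srf_fd;      /* protocol-level descriptor */
    write_fn_t srf_write;   /* set once when the device is registered */
};

/* Stand-ins for udp_write() and tcp_write(); they only tag the result
 * so the caller can see which one ran. */
static int fake_udp_write(int fd, size_t count)
{
    (void)fd;
    return 1000 + (int)count;
}

static int fake_tcp_write(int fd, size_t count)
{
    (void)fd;
    return 2000 + (int)count;
}

/* The dispatcher needs no knowledge of the protocol behind the channel. */
int channel_write(struct chan_sketch *c, size_t count)
{
    return c->srf_write(c->srf_fd, count);
}
```

The point of the design is that the generic layer stays protocol-agnostic: adding a new protocol only requires registering a new set of function pointers, not changing the dispatcher.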
mq_t *srf_ioctl_q, *srf_ioctl_q_tail:
mq_t *srf_read_q, *srf_read_q_tail:
mq_t *srf_write_q, *srf_write_q_tail:
The fields above are linked lists of ioctl, read, and write messages waiting to be processed; each list has a head pointer and a tail pointer. When a message requesting an ioctl, read, or write operation is received, the message is placed at the end of the corresponding list, unless no earlier message of that type is still awaiting processing.
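Keeping a separate tail pointer lets a new message be appended without walking the list. A minimal sketch of such a queue follows; mq_sketch is a stand-in for mq_t with only a link field and an illustrative id, and q_append() and q_pop() are hypothetical helpers rather than functions from the network service.

```c
#include <stddef.h>

/* Minimal stand-in for mq_t: only the link is modelled. */
struct mq_sketch {
    int id;                      /* illustrative payload */
    struct mq_sketch *mq_next;
};

/* Append m to the list described by *head and *tail in constant time. */
void q_append(struct mq_sketch **head, struct mq_sketch **tail,
              struct mq_sketch *m)
{
    m->mq_next = NULL;
    if (*head == NULL)
        *head = m;               /* list was empty */
    else
        (*tail)->mq_next = m;    /* link after the current last element */
    *tail = m;
}

/* Remove and return the oldest message, or NULL if the list is empty. */
struct mq_sketch *q_pop(struct mq_sketch **head, struct mq_sketch **tail)
{
    struct mq_sketch *m = *head;

    if (m != NULL) {
        *head = m->mq_next;
        if (*head == NULL)
            *tail = NULL;        /* list became empty */
        m->mq_next = NULL;
    }
    return m;
}
```

Messages come off the list in the order they were appended, so requests of the same type would be served first in, first out.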
After the network service has been initialized, the endless loop within main() calls sr_rec() whenever a message arrives from the file system. sr_rec() then calls a function to handle the specific request. For open requests, sr_rec() calls sr_open(); for read, write, and ioctl requests, sr_rec() calls sr_rwio(); for close requests, sr_rec() calls sr_close(); for cancel requests, sr_rec() calls sr_cancel().
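The mapping from message type to handler can be summarized in a small sketch. The numeric values of the message types below are placeholders (the real constants come from the MINIX headers), and handler_for() is a hypothetical function that only reports which handler would be chosen; it does not reproduce sr_rec() itself.

```c
#include <stddef.h>

/* Placeholder values; the real constants are defined in the MINIX headers. */
enum { DEV_OPEN = 1, DEV_CLOSE, DEV_READ, DEV_WRITE, DEV_IOCTL3, DEV_CANCEL };

/* Return the name of the handler described above for a given m_type,
 * or NULL for a message type the sketch does not know. */
const char *handler_for(int m_type)
{
    switch (m_type) {
    case DEV_OPEN:   return "sr_open";
    case DEV_CLOSE:  return "sr_close";
    case DEV_READ:
    case DEV_WRITE:
    case DEV_IOCTL3: return "sr_rwio";   /* one handler for all three */
    case DEV_CANCEL: return "sr_cancel";
    default:         return NULL;
    }
}
```

Read, write, and ioctl share one handler because, as described earlier, they follow the same bookkeeping pattern of flags and per-operation queues.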