
DESIGN:
=======

The KIO framework uses workers (separate processes) that handle a given protocol.
Launching those workers is taken care of by the kdeinit/klauncher tandem,
which are notified by DBus. (TODO: update now that klauncher has been removed;
also below.)

Connection is the most low-level class, the one that encapsulates the pipe.

WorkerInterface is the main class for transferring anything to the worker,
and Worker, which inherits WorkerInterface, is the subclass that Job should handle.

A worker inherits WorkerBase, which is the other half of WorkerInterface.

The scheduling is supposed to work on two levels. One is in the daemon
and one is in the application. The daemon one (as opposed to the holy one? :)
determines how many workers may be opened for this application, and it
also assigns tasks to the workers that actually exist.
The application still has some kind of scheduler, but it should be
a lot simpler as it doesn't have to decide anything besides which
task goes to which pool of workers (related to the protocol/host/user/port)
and move tasks around.
Currently a design study (to name it cool) lives in scheduler.cpp, but on the
application side. This is just to test other things like recursive jobs
and signals/slots within WorkerInterface. If someone feels brave, the scheduler
is yours!
On second thought: on the daemon side there is no real scheduler, but a
pool of workers. So what we need is some kind of load calculation in the
application's scheduler and load balancing in the daemon.

A third thought: maybe the daemon can just take care of a number of 'unused'
workers. When an application needs a worker, it can request it from the daemon.
The application will get one, either from the pool of unused workers,
or a new one will be created. This keeps things simple at the daemon level.
It is up to the application to give the workers back to the daemon.
The scheduler in the application must take care not to request too many
workers and could implement priorities.

Thoughts on usage:
* Typically a single worker type is used exclusively by one application. E.g.
http workers are used by a web browser, POP3 workers by a mail program.

* Sometimes a single program has multiple roles. E.g. konqueror is
both a web browser and a file manager. As a web browser it primarily uses
http workers; as a file manager, file workers.

* Selecting a link in konqueror: konqueror does a partial download of
the file to check the MIME type (right??), then the application is
started which downloads the complete file. In this case it should
be possible to pass the worker which did the partial download from konqueror
to the application, where it can do the complete download.

Do we need a hard limit on the number of workers per host?
It seems so, because some protocols will fail if you
have two workers running in parallel (e.g. POP3).
This has to be implemented in the daemon, because only at daemon
level are all the workers known. As a consequence, workers must
be returned to the daemon before connecting to another host.
(Returning the workers to the daemon after every job is not
strictly needed and only causes extra overhead.)

Instead of actually returning the worker to the daemon, it could
be enough to ask 'recycling permission' from the daemon: the
application asks the daemon whether it is ok to use a worker for
another host. The daemon can then update its administration of
which worker is connected to which host.

The above does of course not apply to hostless protocols (like file).
(They will never change host.)

Apart from a 'hard limit' on the number of workers per host we can have
a 'soft limit'. E.g. upon connection to an HTTP 1.1 server, the web
server tells the worker the number of parallel connections allowed.
The simplest solution seems to be to treat 'soft limits' the same
as 'hard limits'. This means that the worker has to communicate the
'soft limit' to the daemon.

Jobs using multiple workers.

If a job needs multiple workers in parallel (e.g. copying a file from
a web server to an ftp server, or browsing a tar file on an ftp site)
we must make sure to request all workers from the daemon together, since
otherwise there is a risk of deadlock.

(If two applications both need a 'pop3' and an 'ftp' worker for a single
job, and only a single worker per host is allowed for pop3 and ftp, we must
prevent giving the single pop3 worker to application #1 and the single
ftp worker to application #2. Both applications would then wait till the
end of time for the other worker they need before they could start their
job. This is a quite unlikely situation, but nevertheless possible.)


File Operations:
listRecursive is implemented as listDir plus checking whether the result
contains a directory. If it does, another listDir job is issued. As listDir
is a read-only operation, it fails when a directory isn't readable
... but the main job goes on and discards the error, because
bIgnoreSubJobsError is true, which is what we want. (David)

del is implemented as listRecursive, removing all files and removing all
empty directories. This basically means that if one directory isn't readable,
we don't remove it, as listRecursive didn't find it. But del will later
on try to remove its parent directory and fail. There are cases where
it would be possible to delete the dir by chmod'ing it first. On the
other hand, del("/") shouldn't list the whole file system and remove all
user-owned files just to find out it can't remove everything else (this
basically means we have to take care of things we can remove before we try).

... Well, rm -rf / refuses to do anything, so we should just do the same:
use a listRecursive with bIgnoreSubJobsError = false. If anything can't
be removed, we just abort. (David)

... My concern was more that the fact we can list / doesn't mean we can
remove it. So we shouldn't remove everything we could list without checking
that we can. But then the question arises: how do we check whether we can
remove it? (Stephan)

... I was wrong: rm -rf /, even as a user, lists everything and removes
everything it can (don't try this at home!). I don't think we can do
better, unless we add a protocol-dependent "canDelete(path)", which is
_really_ not easy to implement, whatever the protocol. (David)


Lib docu
========

mkdir: ...

rmdir: ...

chmod: ...

special: ...

stat: ...

get is implemented as a TransferJob. Clients get 'data' signals with the data.
A data block of zero size indicates end of data (EOD).

put is implemented as a TransferJob. Clients have to connect to the
'dataReq' signal. The worker will call you when it needs your data.

mimetype: ...

file_copy: copies a single file, either using CMD_COPY if the worker
           supports that, or get & put otherwise.

file_move: moves a single file, either using CMD_RENAME if the worker
           supports that, CMD_COPY + del otherwise, or eventually
           get & put & del.

file_delete: deletes a single file.

copy: copies a file or directory, recursively if the latter.

move: moves a file or directory, recursively if the latter.

del: deletes a file or directory, recursively if the latter.

Resuming
--------
If a .part file exists, KIO offers to resume the download.
This requires negotiation between the worker that reads
(handled by the get job) and the worker that writes
(handled by the put job).

Here's how the negotiation goes.
(PJ = put job, GJ = get job)

PJ can't resume:
PJ-->app: canResume(0)  (emitted by dataReq)
GJ-->app: data()
PJ-->app: dataReq()
app-->PJ: data()

PJ can resume but GJ can't resume:
PJ-->app: canResume(xx)
app-->GJ: start job with "resume=xxx" metadata.
GJ-->app: data()
PJ-->app: dataReq()
app-->PJ: data()

PJ can resume and GJ can resume:
PJ-->app: canResume(xx)
app-->GJ: start job with "resume=xxx" metadata.
GJ-->app: canResume(xx)
GJ-->app: data()
PJ-->app: dataReq()
app-->PJ: canResume(xx)
app-->PJ: data()

So when the worker supports resume for "put", it has to check after the first
dataReq() whether it got a canResume() back from the app. If it did,
it must resume. Otherwise it must start from 0.

Protocols
=========

Most KIO workers (but not all) implement internet protocols.
In this case, the worker name matches the URI scheme for the protocol.
A list of such schemes can be found here, as per RFC 4395:
https://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml