
DESIGN:
=======

libkio uses kioslaves (separate processes) that handle a given protocol.
Launching those slaves is taken care of by the kdeinit/klauncher tandem,
which is notified via D-Bus.

Connection is the most low-level class, the one that encapsulates the pipe.

SlaveInterface is the main class for transferring anything to the slave,
and Slave, which inherits SlaveInterface, is the subclass that Job should
handle.

A slave inherits SlaveBase, which is the other half of SlaveInterface.

The scheduling is supposed to work on two levels: one in the daemon
and one in the application. The daemon one (as opposed to the holy
one? :) will determine how many slaves may be opened for this app,
and it will also assign tasks to the slaves that actually exist.
The application will still have some kind of scheduler, but it should
be a lot simpler, as it doesn't have to decide anything besides which
task goes to which pool of slaves (keyed by protocol/host/user/port)
and moving tasks around.
What currently sits in scheduler.cpp is a design study (to give it a
flattering name), and it lives on the application side. It exists just
to test other things, like recursive jobs and signals/slots within
SlaveInterface. If someone feels brave, the scheduler is yours!

On second thought: on the daemon side there is no real scheduler, but
a pool of slaves. So what we need is some kind of load calculation in
the application's scheduler and load balancing in the daemon.

A third thought: maybe the daemon can just take care of a number of
'unused' slaves. When an application needs a slave, it can request one
from the daemon. The application will get one, either from the pool of
unused slaves, or a new one will be created. This keeps things simple
at the daemon level. It is up to the application to give the slaves
back to the daemon. The scheduler in the application must take care
not to request too many slaves, and could implement priorities.

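The request/return model sketched above amounts to simple bookkeeping in the daemon. A minimal sketch (plain C++, hypothetical names; the real daemon communicates over D-Bus and this only models the pool logic):

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical sketch of the daemon-side pool of 'unused' slaves,
// keyed by protocol. An application either gets a recycled idle
// slave or a freshly created one (represented here by an id).
struct SlavePool {
    std::map<std::string, std::vector<int>> unused; // protocol -> idle slave ids
    int nextId = 0;

    // Application requests a slave for a protocol.
    int requestSlave(const std::string &protocol) {
        auto &pool = unused[protocol];
        if (!pool.empty()) {              // reuse an idle slave
            int id = pool.back();
            pool.pop_back();
            return id;
        }
        return nextId++;                  // otherwise "create" a new one
    }

    // It is up to the application to give the slave back when done.
    void returnSlave(const std::string &protocol, int id) {
        unused[protocol].push_back(id);
    }
};
```

Keeping creation and recycling behind one `requestSlave` call is what keeps the daemon simple: it never has to know why the application wants the slave.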
Thoughts on usage:

* Typically a single slave-type is used exclusively in one application.
E.g. http slaves are used in a web-browser; POP3 slaves in a mail
program.

* Sometimes a single program has multiple roles. E.g. konqueror is
both a web-browser and a file-manager. As a web-browser it primarily
uses http-slaves; as a file-manager, file-slaves.

* Selecting a link in konqueror: konqueror does a partial download of
the file to check the MIME type (right??), then the application is
started which downloads the complete file. In this case it should be
possible to pass the slave that did the partial download from konqueror
to the application, where it can do the complete download.

Do we need a hard limit on the number of slaves per host?
It seems so, because some protocols tend to fail if you have two
slaves running in parallel (e.g. POP3).
This has to be implemented in the daemon, because only at the daemon
level are all the slaves known. As a consequence, slaves must be
returned to the daemon before connecting to another host.
(Returning the slaves to the daemon after every job is not strictly
needed and only causes extra overhead.)

Instead of actually returning the slave to the daemon, it could be
enough to ask the daemon for 'recycling permission': the application
asks the daemon whether it is ok to use a slave for another host.
The daemon can then update its administration of which slave is
connected to which host.

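The 'recycling permission' check plus the per-host hard limit could look roughly like this (a sketch with hypothetical names, not the real daemon interface):

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical sketch of the daemon's administration: it tracks
// which slave is connected to which host and refuses recycling
// requests that would exceed the hard limit on slaves per host
// (e.g. 1 for POP3).
struct SlaveRegistry {
    int maxPerHost;
    std::map<int, std::string> hostOf;  // slave id -> connected host
    std::map<std::string, int> countOn; // host -> connected slave count

    explicit SlaveRegistry(int limit) : maxPerHost(limit) {}

    // Application asks: may this slave be used for newHost?
    bool requestRecycle(int slaveId, const std::string &newHost) {
        if (countOn[newHost] >= maxPerHost)
            return false;               // would exceed the hard limit
        auto it = hostOf.find(slaveId);
        if (it != hostOf.end())
            --countOn[it->second];      // slave leaves its old host
        hostOf[slaveId] = newHost;
        ++countOn[newHost];
        return true;
    }
};
```

A slave-reported 'soft limit' (see below) could be fed into the same check by lowering `maxPerHost` for that host.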
The above does of course not apply to hostless protocols (like file);
they never change host.

Apart from a 'hard limit' on the number of slaves per host we can have
a 'soft limit'. E.g. upon connection to an HTTP 1.1 server, the web
server tells the slave the number of parallel connections allowed.
The simplest solution seems to be to treat 'soft limits' the same as
'hard limits'. This means that the slave has to communicate the 'soft
limit' to the daemon.

Jobs using multiple slaves.

If a job needs multiple slaves in parallel (e.g. copying a file from
a web-server to an ftp-server, or browsing a tar-file on an ftp-site),
we must request all the slaves from the daemon together, since
otherwise there is a risk of deadlock.

(If two applications each need a 'pop3' and an 'ftp' slave for a
single job, and only a single slave per host is allowed for pop3 and
ftp, we must prevent giving the single pop3 slave to application #1
and the single ftp slave to application #2. Both applications would
then wait till the end of time for the other slave so that they can
start their job. This is a quite unlikely situation, but nevertheless
possible.)

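The deadlock avoidance above is classic all-or-nothing allocation. A minimal sketch of the idea (hypothetical names, not the daemon's real interface):

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical sketch: a job that needs several slaves at once
// (e.g. pop3 + ftp) is granted either all of them or none, so two
// jobs can never deadlock each holding half of what they need.
struct Allocator {
    std::map<std::string, int> freeSlaves; // protocol -> available count

    bool allocate(const std::vector<std::string> &needed) {
        // First pass: check that the whole request can be satisfied.
        std::map<std::string, int> want;
        for (const auto &p : needed)
            ++want[p];
        for (const auto &[proto, n] : want)
            if (freeSlaves[proto] < n)
                return false;              // grant nothing
        // Second pass: take all of them at once.
        for (const auto &[proto, n] : want)
            freeSlaves[proto] -= n;
        return true;
    }
};
```

A job refused here can simply retry later; since it holds nothing in the meantime, it cannot block another job's progress.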
File Operations:

listRecursive is implemented as a listDir, followed by checking
whether the result contains a directory. If it does, another listDir
job is issued. As listDir is a read-only operation, it fails when a
directory isn't readable... but the main job goes on and discards the
error, because bIgnoreSubJobsError is true, which is what we want.
(David)

del is implemented as a listRecursive, removing all files and then
removing all the empty directories. This basically means that if one
directory isn't readable, we don't remove it, as listRecursive didn't
find it. But del will later try to remove its parent directory and
fail. Yet there are cases where it would be possible to delete the
dir by chmod'ing it first. On the other hand, del("/") shouldn't list
the whole file system and remove all user-owned files just to find
out it can't remove everything else (this basically means we have to
take care of the things we can remove before we try).

 ... Well, rm -rf / refuses to do anything, so we should just do the
 same: use a listRecursive with bIgnoreSubJobsError = false. If
 anything can't be removed, we just abort. (David)

 ... My concern was more that the fact that we can list / doesn't
 mean we can remove it. So we shouldn't remove everything we could
 list without checking that we can. But then the question arises:
 how do we check whether we can remove it? (Stephan)

 ... I was wrong: rm -rf /, even as a user, lists everything and
 removes everything it can (don't try this at home!). I don't think
 we can do better, unless we add a protocol-dependent
 "canDelete(path)", which is _really_ not easy to implement, whatever
 the protocol. (David)

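For local files, the "list first, then remove files, then remove the emptied directories, abort on the first error" shape of del (the bIgnoreSubJobsError = false variant) can be sketched with std::filesystem. This is only an illustration of the ordering, not the kioslave implementation:

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Sketch: recursive delete that lists everything up front, removes
// the files, then removes the now-empty directories deepest-first,
// and aborts as soon as any step fails (no errors are discarded).
bool delRecursive(const fs::path &root) {
    std::error_code ec;
    std::vector<fs::path> files, dirs;
    for (auto it = fs::recursive_directory_iterator(root, ec);
         !ec && it != fs::recursive_directory_iterator(); it.increment(ec)) {
        (it->is_directory() ? dirs : files).push_back(it->path());
    }
    if (ec)
        return false;                      // listing failed: abort
    for (const auto &f : files)
        if (!fs::remove(f, ec) || ec)
            return false;                  // a file couldn't be removed
    // Parents are listed before their children, so the reverse order
    // removes the deepest directories first.
    for (auto d = dirs.rbegin(); d != dirs.rend(); ++d)
        if (!fs::remove(*d, ec) || ec)
            return false;                  // a directory couldn't be removed
    return fs::remove(root, ec) && !ec;    // finally the root itself
}
```

Note that this still has the limitation discussed above: being able to list an entry does not prove it can be removed; the removal itself is the only reliable check.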
Lib docu
========

mkdir: ...

rmdir: ...

chmod: ...

special: ...

stat: ...

get is implemented as a TransferJob. Clients get 'data' signals with
the data. A data block of zero size indicates end of data (EOD).

put is implemented as a TransferJob. Clients have to connect to the
'dataReq' signal. The slave will call you when it needs your data.

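The zero-size-block EOD convention means a consumer must treat an empty block as a terminator, not as data. A tiny sketch of a client collecting 'data' blocks (the blocks are modeled as a plain vector here, not real signal emissions):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Sketch of the EOD convention used by get/TransferJob: data arrives
// in blocks, and a zero-size block marks end of data. Anything after
// the empty block must never be consumed.
std::string collect(const std::vector<std::string> &blocks) {
    std::string result;
    for (const auto &block : blocks) {
        if (block.empty())    // zero-size block: end of data (EOD)
            break;
        result += block;
    }
    return result;
}
```

put uses the same convention in the opposite direction: the client answers the final dataReq with an empty block to signal EOD.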
mimetype: ...

file_copy: copies a single file, either using CMD_COPY if the slave
           supports that, or get & put otherwise.

file_move: moves a single file, either using CMD_RENAME if the slave
           supports that, CMD_COPY + del otherwise, or eventually
           get & put & del.

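The file_move fallback chain is a straightforward capability test. A sketch (the boolean capability flags are hypothetical, not the real SlaveBase API):

```cpp
#include <cassert>
#include <string>

// Sketch of the file_move fallback chain: prefer a native rename,
// then a native copy followed by del, and only as a last resort the
// fully generic get & put & del.
std::string moveStrategy(bool slaveCanRename, bool slaveCanCopy) {
    if (slaveCanRename)
        return "CMD_RENAME";
    if (slaveCanCopy)
        return "CMD_COPY + del";
    return "get + put + del";
}
```

The ordering matters because each step down the chain costs more: a rename is one operation on the remote side, while get & put streams the whole file through the application.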
file_delete: deletes a single file.

copy: copies a file or directory, recursively if the latter.

move: moves a file or directory, recursively if the latter.

del: deletes a file or directory, recursively if the latter.

Resuming
--------

If a .part file exists, KIO offers to resume the download.
This requires negotiation between the kioslave that reads (handled by
the get job) and the kioslave that writes (handled by the put job).

Here's how the negotiation goes.
(PJ=put-job, GJ=get-job)

PJ can't resume:
PJ-->app: canResume(0)  (emitted by dataReq)
GJ-->app: data()
PJ-->app: dataReq()
app->PJ: data()

PJ can resume but GJ can't resume:
PJ-->app: canResume(xx)
app->GJ: start job with "resume=xxx" metadata.
GJ-->app: data()
PJ-->app: dataReq()
app->PJ: data()

PJ can resume and GJ can resume:
PJ-->app: canResume(xx)
app->GJ: start job with "resume=xxx" metadata.
GJ-->app: canResume(xx)
GJ-->app: data()
PJ-->app: dataReq()
app->PJ: canResume(xx)
app->PJ: data()

So when the slave supports resume for "put", it has to check after the
first dataReq() whether it has got a canResume() back from the app.
If it did, it must resume. Otherwise it must start from 0.

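The three cases above collapse to one rule for the offset the transfer finally uses: the put side resumes only when both sides support it, because only then does the app echo canResume() back. A sketch of that decision (hypothetical function, not the SlaveBase API):

```cpp
#include <cassert>

// Sketch of the resume negotiation outcome. partSize is the size of
// the existing .part file; the put job may only continue at that
// offset when it emitted canResume(xx) AND the get job confirmed it
// can seek, i.e. the app echoed canResume() back before sending data.
long long negotiatedOffset(bool pjCanResume, bool gjCanResume,
                           long long partSize) {
    if (pjCanResume && gjCanResume)
        return partSize;  // resume: append to the .part file
    return 0;             // otherwise start from byte 0
}
```

This is why the put slave must wait until after the first dataReq() round-trip before writing: only then does it know which branch it is in.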
Protocols
=========

Most KIO slaves (but not all) implement internet protocols.
In this case, the slave name matches the URI scheme for the protocol.
A list of such schemes, as per RFC 4395, can be found here:
https://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml