-
Notifications
You must be signed in to change notification settings - Fork 0
/
ChangeLog
309 lines (258 loc) · 12.8 KB
/
ChangeLog
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
ChangeLog for CloudScheduling project
==================================================
2012-10-18 Lipu Fei Started this project.
==================================================
Changes:
1. A scheduling framework has been created seems fine so far.
Finished Works:
1. The scheduling framework has been simply tested.
2. The module org.cloudability.util.CloudConfig has been simply tested.
Todos:
1. Finish implementing the Job execution procedure. SSH modules in KOALA may be
needed.
2. Create a more general job allocation policy framework for the scheduler.
3. Create a more general resouce provisioning policy framework.
4. Integrate the XML-RPC modules previous implemented into this project.
5. Integrate the request listening modules.
6. Design and implement a JSDL parser.
Notes:
1. The Job's execution procedure can first be implemented in a simplified way
which can satisfy the needs of in4392 lab. Then to think of a more general
way.
2. Implementing a flexible framework for resource provisioning policies seems
to be very difficult and complicated. So far, this framework is sufficient
for the in4392 lab.
3. JSDL parser can wait. It is better to design and create a simple job
description file format for submitting job, which can be utilized for in4392
lab.
2012-10-19 Lipu Fei
==================================================
Fixes:
1. Job's equals() function was not correct.
Changes:
1. Implemented equals() for VMInstance.
2. Assigning a VM instance to a job is now moved from JobMonitor to the
allocator. It is allocator's responsibilty to get a job and a VM instance.
3. A sort() has been implemented for JobQueue to sort the queue with a given
comparator.
4. A base class of allocation policies has been implemented. A FCFSAllocator and
a helper class FCFSJobComparator have been implemented.
5. Attribute arrivalTime has been added to Job.
6. Provisioner, a base class for provisioning policies has been implemented. A
simple static provision policy has been implemented.
7. The allocation and provisioning logic has been moved from Scheduler to
Allocator.
Todos:
1. Add a finialization method for ResourceManager to stop the provisioning
policy thread.
2. Integrate the Cloud Manipulators into this framework and complete the static
provisioning policy with it.
3. Pofiler.
4. Integrate Request Handler.
2012-10-22 Lipu Fei
==================================================
Changes:
1. A simple profiler has been added.
2. The stopping of provisioner has been added in the resource manager.
3. Client request handling modules have been added.
4. A simple request parser has been implemented.
5. Cloud broker has been added.
Todos:
1. Test them.
2012-10-23 Lipu Fei
==================================================
Changes:
1. The scheduling procedure has been changed. Now the scheduler first checks for
available VM instance and then gets a job, because it makes no sense to
allocate a job with no idle VM instance available.
2. The job now has a hashmap that contains all related parameters it needs.
3. Ported and tested the SSH module in KOALA.
4. Started implementing the job execution procedure.
5.
Todos:
1. For now, the scheduler doesn't keep track on each job and its monitor, and
the finalization of the scheduler simply does nothing. It would be better to
keep track on the monitors, so that when an exit signal has been raised, the
scheduler can stop these monitors and their jobs gracefully.
2. The uploading and downloading of files during the job execution are simple
blocking methods. There may be a better and more efficient way to do this.
3. Change the authentication of SSH module to using username and password
instead of using known database, because VM instances are created dynamically
and you don't always have them in you known_hosts.
4. Find a good way to monitor the VM instances.
2012-10-24 Lipu Fei
==================================================
Changes:
1. The full execution procedure of the job has been implemented and tested.
2. Now the command of executing the program using ANT file build.xml is
"ant CloudAbility". The old one was "ant CloudAbility (1)" because of some
eclipse launch configuration problems. Now this is fixed.
3. The OpenNebula moudle has been integrated into the project.
4. Full automation has been nearly achieved.
5. A regular check method has been added to the ResourceManager.
6. Now VM instances are created through VMAgents, independent threads that are
responsible for allocating VMs and monitoring them until they are ready to
be used.
7. The shutdown of the system is more graceful then before: it will stop all
existing VMAgents and then finalizes all allocated VM instances.
Todos:
1. Implement profiler for jobs.
2. Achieve full automation (implement the client side).
4. For JobMonitors, find a good way to keep track on them, so that the system
can be shutdown gracefully.
2012-10-25 Lipu Fei
==================================================
Fixes:
1. A critical bug in the static provisioning policy has been fixed. The bug was
located in the method allocate(), which would pick a VMInstance for the
selected Job. This is no correct since in the scheduler, an available
VMInstance has already been picked.
Changes:
1. JobMonitors are now being tracked, and when scheduler is reqiured to be
stopped, it will raise a stop signal to the running JobMonitors, and they
will stop the Jobs that are monitoring.
2. The allocate() method in the provisioner now should occupy a VMInstance
immediately after it is selected, because there is a provisioner thread that
would act according to the current situation (i.e. remove idle VMInstances).
The scheduler picks a VMInstance first, so if this picked VMInstance is
released when the scheduler is still choosing a Job, then an error would
occur.
3. A regular check method has been implemented for the scheduler, which removes
all finished JobMonitors.
4. System status is regularly updated in the scheduler.
Tests:
1. Several tests have been carried out for several times, including 5 jobs with
3 VMs, 5 jobs with 5 VMs.
2. Shutting down the system while having Jobs in execution has been tested
for several times. The system can gracefully shutdown the Jobs, JobMonitors,
VMAgents, and VMInstances created by the system before it is terminated.
3. The profiler for jobs has been implemented. The performance statistics is
gathered in the JobMonitor after a Job has been finished. The current metrics
includes: makespan, running-time, preparation-time, upload-time,
tarball-extraction-time, execution-time, and download-time.
More suitable metrics may be further discussed.
Todos:
1. Improve profiler for jobs.
2. Implement a central statistics database for storing system performance and
job performance.
3. Improve the system, such as synchronization, etc.
Notes:
1. Sometime the allocation of a VM instance may fail. Then, for a provisioning
policy like static, it may not have the specified number of VM instances.
One solution for this is that, we can commence a regular check that makes
sure the #VMAgents + #VMInstances = #DesiredVMInstances. So if some VMAgents
fail to allocate a usable VMInstance, this regular check will notice this
and create another VMAgent to allocate one, so that the total number of
VMInstances should be the desired number.
2012-10-26 10:11 Lipu Fei
==================================================
Changes:
1. Provisioner thread has been removed.
2. The provisioner check is now a sequential method in the scheduling cycle, so
no need to worry about concurrency problems now.
3. The updateStatus method has been moved from Scheduler to DataManager, and it
updates the jobs' waiting times.
4. Jobs now updates its failure times when it fails or it is stopped, and
JobMonitor will put this job back into the pending queue again.
5. The statistics MaxVMsExisting is now available.
6. The System Performance Over Time is now available.
7. A shutdown hook has been added. Now it is possible to exit with CTRL-C.
Todos:
1. Add a threshold to limit the maximum number of failures a job can have.
Otherwise some invalid jobs would be in the pending queue forever.
2. May consider to remove JobMonitor, because there are too many threads in the
system now.
3. The metric PreparationTime doesn't seem to be useful. Consider to remove it.
2012-10-26 11:59 Lipu Fei
==================================================
Changes:
1. Scheduler is now a subclass of Thread instead of implementing Runnable. This
is for control convenience consideration.
2. ClientRequestListener is now a subclass of Thread instead of implementing
Runnable. This is for control convenience consideration.
3. Now there is a timeout for each running job. The timeout is 4 minutes. This
change is made in JobMonitor.
4. VM instances now have aggregate idle times.
5. A new metric "VM preparation time" has been added.
2012-10-26 12:42 Lipu Fei
==================================================
Changes:
1. The format of the statistics output file has been slightly changed.
2. A so-called "Simple Elastic" provisioning policy has been added. Please
update the configuration file.
2012-10-26 16:35 Lipu Fei
==================================================
Changes:
1. The JobProfiler has been changed to a general profiler called "Profiler".
2. Each VMInstance has its own profiler now.
3. The statistics module now collects VMInstance profilers and can generate
individual VMInstance profiling results in the final report.
Todos:
1. VMInstance management and profiling modules are in a total mass right now.
They must be redesigned later.
2012-10-27 14:09 Lipu Fei
==================================================
Changes:
1. The multi-threading mechanism for handling client requests has been changed.
2. The multi-threading mechanism for handling VMAgents has been changed.
3. The shutdown hook has been improved.
4. Statistics data file now prints the currenly date and time.
2012-10-27 15:50 Lipu Fei
==================================================
Changes:
1. A new profiling tool "Recorder" added.
2. VMState has been moved out from VMInstance as an independent enum type.
3. CloudBroker and XmlRpcBroker has been abandoned. The cloud manipulation
functions are now simplified and encapusulated in Apdaters.
4. The VMInstance has been changed, it is no longer the broker's (now, adapter)
job to create, update, or terminate a VM instance. A VMInstance is
responsible to do these things itself.
Known Bugs:
1. There is no NullPointer check is the statistics module, so some metric may
not exist (such as "busyTime" for VMs) and this will cause the saving to file
procedure raise an expection.
2012-10-27 18:38 Lipu Fei
==================================================
Fixes:
1. Fixed the Null pointer bug in Statistics module.
Changes:
1. JobMonitors are abandoned. The system overhead is much lower than before.
2. Statistics module now also collects unfinished job's data when shutting down.
Todos:
1. Improve log4j. Create configuration file. Output logs into seperate files:
One for system, and each job should have a log file., etc.
2012-10-28 13:31 Lipu Fei
==================================================
Fixes:
1. Eliminated warnings in some classes.
Changes:
1. A centralized logger CloudLogger added. Seperate log files for the system and
jobs are enabled.
2. DataManager has been renamed as "CentralManager".
3. ClientRequestListener and ClientRequestHandler are renamed as
"ClientJobListener" and "ClientJobHandler" respectively.
4. The things in org.cloudability.server.job have been moved to package
"org.cloudability.server.job" because other servers may be added in the
future.
Todos:
1. Consider to change the pending job queue to be a blocking queue, and remove
the JobQueue class.
2. It looks like a mess to use a module named "CentralManager" while there is
another manager named "ResourceManager". Although merging them altogether
would create a massive "god" class, but this option may still be considered.
3. Improve statistics module. May consider using DB such as SQLite.
4. Make job executiong more automatic, using a single script file or python
script like what SkyMark does.
5. Create a Web server and a DB server for collecting application data.
6. Add functionality to support Amazon EC2.
7. Consider to implement multi-tenancy.
2012-10-28 22:20 Lipu Fei
==================================================
Fixes:
1. Fixed the problem that not all VM instances are recorded.
2. Fixed a bug in StatisticsManager. The calculation of preparation time of VMs
was not correct.
Changes:
1. A provisioner "VMTestProvisioner" for testing VM allocation and preparation
overhead has been added.
2. Some output text and format have been changed for better readability.