Tuesday, July 23, 2013

Possible Grade Manipulation in the 2013 Lebanese Official Exams

This year, like every year, grade 9 and grade 12 students sat for official exams that determine how they will continue their education. And like every year, the results were published online. This year, Naharnet published all the results with no regard for student privacy. All you need to do is supply the examinee's number and the type of exam (SG, LS, etc.), and you instantly get their name and detailed scores.

So for fun, I wrote a Python script that uses the Mechanize library to connect to Naharnet and fill in the correct HTML form. It iterates through the student numbers and extracts the results into a CSV file. After cleaning out spaces and forward slashes (/), I imported the results into Microsoft Excel for analysis. (Here is the file if you want to have a look at it.) Look further down for the updated file.

1:  # -*- coding: cp1252 -*-  
2:  import re  
3:  import mechanize  
4:  import csv  
5:  import socket  
6:  br = mechanize.Browser()  
7:  def opensite():  
8:    try:  
9:      time_t = 6.01  
10:      print "Attempting Connection..."  
11:      br.open("http://www.naharnet.com/exam", timeout = time_t)  
12:      print "Connected."  
13:    except mechanize.URLError, exc:  
14:      if isinstance(exc.reason, socket.timeout):  
15:        print "timeout occurred!"  
16:        opensite()  
17:  def getform():  
18:    try:  
19:      br.form = list(br.forms())[1]  # The form we want is the second on the list
20:      br["exam_category"] = ["SG"]  
21:      br["candidate_id"] = str(n)  
22:      response2 = br.submit()  
23:      return response2.read()  
24:    except mechanize.URLError, exc:  
25:      if isinstance(exc.reason, socket.timeout):  
26:        print "timeout occurred!"  
27:        return getform()  # retry and hand the page back to the caller  
28:  ### Numbers list ###  
29:  a = []  
30:  i = 1  
31:  while i < 1000:  
32:    a.append(i)  
33:    i += 1  
34:  i = 20000  
35:  while i < 21500:  
36:    a.append(i)  
37:    i += 1  
38:  i = 40000  
39:  while i < 40500:  
40:    a.append(i)  
41:    i += 1  
42:  i = 50000  
43:  while i < 52600:  
44:    a.append(i)  
45:    i += 1  
46:  i = 80000  
47:  while i < 80500:  
48:    a.append(i)  
49:    i += 1  
50:  i = 90000  
51:  while i < 90500:  
52:    a.append(i)  
53:    i += 1  
54:  ### END NUMBERS LIST ###  
55:  filename = "scores"+str(a[0])+".csv"  
56:  with open(filename, 'wb') as myfile:  
57:    c = csv.writer(myfile)  
58:    for n in a:  
59:      opensite()  
60:      print "Acquiring record number %s" %(n)  
61:      x = getform() + " "  
62:      sub_list = ["Mathematics", "Physics", "Chemistry", "Arabic", "Foreign language", "Philosophy", "History", "Geography", "Civil education"]   
63:      score_list = []  
64:      for p in sub_list:  
65:        math= x[x.find(p):]  
66:        #print p + ": "  
67:        #print math[math.find("result_grade") + 14 : math.find("result_grade") + 17]  
68:        score = math[math.find("result_grade") + 14 : math.find("result_grade") + 17]  
69:        score_list.append(score)  
70:      c.writerow(score_list)  
71:      print "Record %s appended to file..." %(n)  
72:      print score_list  
73:      myfile.flush()  
74:    myfile.close()  
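
The cleanup step mentioned before the script (removing spaces and forward slashes) is not part of the script itself. A minimal sketch of how it could be done in Python instead of by hand, assuming the 3-character slices come out looking like "15/", "7/2" or " 9 " (that format, and the function names, are my assumptions):

```python
def clean_score(raw):
    """Keep only the part before any '/' and drop surrounding spaces.

    Assumes cells look like '15/', '7/2' or ' 9 ' (hypothetical examples).
    """
    return raw.split("/")[0].strip()


def clean_row(row):
    """Apply clean_score to every cell of one CSV row."""
    return [clean_score(cell) for cell in row]


if __name__ == "__main__":
    # Made-up row, only to show the effect of the cleanup.
    print(clean_row(["15/", "7/2", " 9 "]))
```

Cutting at the slash, rather than just deleting it, avoids turning a cell like "7/2" into "72".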

The results were strange. I wouldn't completely rule out grade manipulation, but I will leave it to my readers to draw their own conclusions.

The results so far are only from the SG series of exams. The program takes some time to fetch all the records, and it sometimes crashes, leaving me with half a file. (When that happens, I check which record was appended last, modify the list of numbers, and restart the program. Each run creates a new file called scores#.csv, which explains line 55 in the code.)
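
The manual restart described above could be sketched roughly like this. The ranges are the ones from the script; the helper names `build_numbers` and `remaining_numbers` are hypothetical and not part of the original program:

```python
# Candidate number ranges copied from the script (end value excluded).
RANGES = [(1, 1000), (20000, 21500), (40000, 40500),
          (50000, 52600), (80000, 80500), (90000, 90500)]


def build_numbers(ranges=RANGES):
    """Build the full list of candidate numbers from (start, stop) pairs."""
    numbers = []
    for start, stop in ranges:
        numbers.extend(range(start, stop))
    return numbers


def remaining_numbers(numbers, last_done):
    """Return the numbers that come after the last record written to file."""
    numbers = list(numbers)
    if last_done not in numbers:
        return numbers  # nothing recognisable was done yet, start over
    return numbers[numbers.index(last_done) + 1:]


if __name__ == "__main__":
    todo = remaining_numbers(build_numbers(), 20750)
    print("Resuming at %s, %s records left" % (todo[0], len(todo)))
```

Because the script names its output file after the first number in the list, restarting with the shortened list writes to a new file instead of overwriting the old one.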

As soon as I get the rest of the grades, I will publish the results.

Let us discuss the anomalies:

There seems to be something wrong with the grade distribution of both Arabic and Philosophy. In the case of Arabic, for example, not a single student got 16, 21, 26, 31, or 36 out of the total 40 marks. Please note that it IS possible to solve the exam in a certain way and attain those grades. The chart then looks like this:
The grades seem to have been divided into 5 distinct populations. 
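
A gap check like the one described for Arabic can be sketched in a few lines. This assumes whole-number grades already loaded into a list; the function name and the sample list are invented for illustration and are not the real data:

```python
from collections import Counter


def missing_grades(grades):
    """List every whole grade between the lowest and highest observed
    grade that no student received."""
    counts = Counter(grades)
    return [g for g in range(min(grades), max(grades) + 1) if counts[g] == 0]


if __name__ == "__main__":
    # Invented sample, NOT taken from the exam results.
    sample = [14, 15, 17, 18, 19, 20, 22, 15, 17]
    print(missing_grades(sample))
```

A few gaps at the extremes are expected in any sample; what stands out in the Arabic data is gaps at regular intervals in the middle of the range.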

In the case of Philosophy, the graph looks even more bizarre. Several attainable grades do not appear at all, which is statistically improbable. In fact, all the grades are even numbers; there are NO odd ones.
The rest of the subjects are in the supplied Excel file below.
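
The even/odd observation about Philosophy is easy to verify with a count. The sketch below also gives a rough sense of scale, under the strong assumption (mine, not something the data establishes) that odd and even grades would otherwise be equally likely and independent:

```python
def parity_counts(grades):
    """Count how many whole-number grades are even and how many are odd."""
    odd = sum(1 for g in grades if g % 2 == 1)
    return {"even": len(grades) - odd, "odd": odd}


def chance_all_even(n):
    """Chance that n grades are all even IF each grade were independently
    even or odd with probability 0.5 (an assumption, see the lead-in)."""
    return 0.5 ** n


if __name__ == "__main__":
    # Invented sample, NOT taken from the exam results.
    print(parity_counts([10, 12, 14, 20]))
    print(chance_all_even(10))
```

If the exam were marked only in 2-point steps, all-even grades would be expected, so the count alone does not prove anything.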

Enjoy the corruption, or the misunderstanding...
Expect more data soon.


All the SG data has now been collected, and the results have not changed at all.
The Civic Ed exam was apparently easy, because the whole distribution is shifted to the right (toward higher grades). That means a higher average with an otherwise normal-looking distribution.

Math looks normally distributed; there seem to be no irregular patterns in the distribution.
Physics has several spikes that I have yet to interpret. I'll need to divide the data by region and see where the anomaly occurs.

Here is the new updated file: [SG Grades Excel File]

1 comment:

  1. (Y) thats amazing :D i'd still want my marks changed though i somewhat get a bit better (A)