4. Benford’s Law to detect data manipulation The first-digit distributions of both the analog meter and the SmartMeter readings conform very closely, both visually and statistically, to Benford’s Law (chi-square p value of 0.90). From this standpoint, there is no evidence of data manipulation.
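As a rough illustration of how such a check can be set up, the sketch below computes the chi-square statistic of observed first digits against the Benford proportions log10(1 + 1/d). It is not the analysis behind the figure above: the function names and the sample readings are hypothetical, and the statistic is compared with the standard 5% critical value for 8 degrees of freedom (15.507) instead of being converted to a p value.

```python
import math
from collections import Counter

def first_digit(x):
    """Return the leading non-zero digit of a non-zero number."""
    s = f"{abs(x):.15e}"  # scientific notation, e.g. '3.200000000000000e+02'
    return int(s[0])

def benford_chi_square(readings):
    """Chi-square statistic of observed first digits against Benford's Law."""
    digits = [first_digit(r) for r in readings if r != 0]
    n = len(digits)
    counts = Counter(digits)
    stat = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)  # Benford proportion times n
        stat += (counts.get(d, 0) - expected) ** 2 / expected
    return stat

# Hypothetical data: powers of 2 are known to follow Benford's Law closely
readings = [2 ** k for k in range(1, 201)]
stat = benford_chi_square(readings)

# Chi-square critical value for 8 degrees of freedom at the 5% level
CRITICAL_5PCT_DF8 = 15.507
print(f"chi-square = {stat:.2f}, conforms = {stat < CRITICAL_5PCT_DF8}")
```

A statistic below the critical value corresponds to a p value above 0.05, meaning the first digits show no significant departure from Benford’s Law.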
5. Measures of Central Tendency The average electricity consumption readings for the two meter types are almost identical.
6. Unpaired Student’s t test The unpaired Student’s t test is the most common statistical test for checking whether a tested sample (SmartMeters) differs from a control group (analog meters). As shown, the p value of 98.9% is the probability of observing a difference in means at least this large if the two samples came from the same population. With a p value this high, the test finds no statistically significant difference between the two: the observed difference is trivial and consistent with randomness.
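The mechanics of the test can be sketched as follows. This is a minimal illustration, not the calculation behind the reported figure: the function name and the simulated readings are hypothetical, and the two-sided p value uses the normal distribution in place of the t distribution, which is only a reasonable approximation for large samples.

```python
import math
import random
from statistics import NormalDist, mean, variance

def unpaired_t(a, b):
    """Pooled-variance (Student) t statistic for two independent samples."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    t = (mean(a) - mean(b)) / se
    # Assumption: large samples, so the normal distribution approximates t
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p

# Hypothetical readings drawn from the same population for both meter types
random.seed(0)
analog = [random.gauss(500, 80) for _ in range(200)]
smart = [random.gauss(500, 80) for _ in range(200)]

t, p = unpaired_t(analog, smart)
print(f"t = {t:.3f}, approximate p = {p:.3f}")
```

A p value near 1 says the observed gap between the means is about as small as chance alone would produce; a p value below the chosen threshold (commonly 0.05) would indicate a significant difference.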
7. Testing whether samples come from populations that are normally distributed The Jarque-Bera test checks whether the meter reading samples are consistent with populations that are normally distributed, using the skewness and kurtosis of the samples. In this case, the p value is approximately 0%: samples with skewness and kurtosis this far from those of a normal distribution would almost never arise from normally distributed populations, so normality is rejected. Thus, we can’t rely solely on the unpaired Student’s t test, which assumes that the samples come from normally distributed populations. We have to use a nonparametric test that relaxes the normal distribution assumption instead.
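For reference, the Jarque-Bera statistic is JB = n/6 × (S² + (K − 3)²/4), where S is the sample skewness and K the sample kurtosis. Under normality it is asymptotically chi-square distributed with 2 degrees of freedom, so its p value is exp(−JB/2). The sketch below implements that formula; the function name and both sample data sets are hypothetical and are not the meter readings.

```python
import math
from statistics import NormalDist

def jarque_bera(x):
    """Return the Jarque-Bera statistic and its asymptotic p value."""
    n = len(x)
    m = sum(x) / n
    # Central moments (divided by n, as the test definition requires)
    m2 = sum((v - m) ** 2 for v in x) / n
    m3 = sum((v - m) ** 3 for v in x) / n
    m4 = sum((v - m) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)
    p = math.exp(-jb / 2)  # survival function of chi-square with 2 df
    return jb, p

# Hypothetical samples: evenly spaced normal quantiles vs. a right-skewed series
normal_like = [NormalDist().inv_cdf((i + 0.5) / 1000) for i in range(1000)]
skewed = [math.exp(i / 50) for i in range(500)]

for name, sample in (("normal-like", normal_like), ("skewed", skewed)):
    jb, p = jarque_bera(sample)
    print(f"{name}: JB = {jb:.2f}, p = {p:.3g}")
```

A p value close to zero, as for the skewed series, is the situation described above: the normality assumption is rejected.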
8. Mann-Whitney test Because the meter readings are not normally distributed, we use a nonparametric test, the Mann-Whitney test, that relaxes the normal distribution assumption. The main difference between the Mann-Whitney test and the unpaired Student’s t test is that the Mann-Whitney test compares average ranks instead of average values. The Mann-Whitney test gives directionally the same answer as the Student’s t test (p value of 98.8%). Thus, any difference between the two types of meters is consistent with randomness.
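To show what "average rank instead of average value" means in practice, the sketch below pools both samples, ranks them (tied values share the mean of their ranks), and derives the U statistic from the rank sum of the first sample. The function names and readings are hypothetical, and the p value uses the large-sample normal approximation without a tie correction, so it is an approximation only.

```python
import math
from statistics import NormalDist

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney(a, b):
    """Return U for sample a and an approximate two-sided p value."""
    na, nb = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    u = sum(ranks[:na]) - na * (na + 1) / 2
    mu = na * nb / 2
    sigma = math.sqrt(na * nb * (na + nb + 1) / 12)  # assumption: no tie correction
    z = (u - mu) / sigma
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u, p

# Hypothetical readings in kWh for the two meter types
analog = [412, 530, 498, 601, 455, 520, 487, 575]
smart = [420, 525, 502, 590, 461, 515, 490, 580]
u, p = mann_whitney(analog, smart)
print(f"U = {u}, approximate p = {p:.3f}")
```

Because only ranks enter the calculation, a few extreme readings cannot dominate the result the way they can in a comparison of means.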
9. Viewing differences on a scatter plot This scatter plot graphs cumulative usage in kWh on the x-axis and the percentage difference between the two types of meters on the y-axis. The difference shrinks as cumulative usage increases. This suggests that the difference between the two meters may be driven by discrepancies occurring during the setup of the meters, and that the remainder of the period generates very accurate and comparable readings. Thus, the initial discrepancy shrinks naturally in percentage terms as cumulative usage increases.
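The arithmetic behind this shrinking pattern can be illustrated with a small sketch. It assumes, purely for illustration, that both meters record identical daily usage and that the SmartMeter starts with a fixed 5 kWh offset from setup; the usage figures, the offset, and the choice of the analog reading as the base of the percentage are all hypothetical.

```python
def percent_differences(analog_cum, smart_cum):
    """Percentage difference of SmartMeter vs. analog cumulative readings."""
    return [100 * (s - a) / a for a, s in zip(analog_cum, smart_cum)]

# Hypothetical scenario: 30 days at 20 kWh per day, with a one-time
# 5 kWh discrepancy introduced when the SmartMeter was set up
daily_usage = [20] * 30
offset = 5

analog_cum, smart_cum = [], []
total = 0
for kwh in daily_usage:
    total += kwh
    analog_cum.append(total)
    smart_cum.append(total + offset)

pct = percent_differences(analog_cum, smart_cum)
for day in (1, 10, 30):
    print(f"day {day}: cumulative {analog_cum[day - 1]} kWh, "
          f"difference {pct[day - 1]:.2f}%")
```

The absolute gap stays constant while the denominator grows, so the percentage difference falls steadily, which matches the shape of the scatter plot.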