SIT718 Real world Analytics
This assignment assesses:
ULO1: Apply the concepts of multivariate functions to summarise datasets.
ULO2: Analyse datasets by interpreting model and function parameters of important families of multivariate functions.
ULO3: Transform a real-life problem into a mathematical model.
ULO4: Apply linear programming concepts to make optimal decisions.
ULO6: Obtain optimal solutions for quantities that are either continuous or discrete.
This assignment consists of two parts: Part A and Part B. Each part is allocated 50 marks
and contributes 15% to the final mark.
SIT718 Assignment 2018 – T1
Part A: Analysis of Energy Efficiency Dataset for Buildings
Description:
In order to design energy-efficient buildings, the computation of the Heating Load (HL) and
the Cooling Load (CL) is required to determine the specifications of the heating and cooling
equipment needed to maintain comfortable indoor air conditions. Energy simulation tools are
widely used to analyse or forecast building energy consumption. The dataset provides an energy
analysis of the Heating Load (denoted as Y1) and the Cooling Load (denoted as Y2) for 768
building shapes that were simulated using a building simulator. Select one of Y1 or Y2 as your
variable of interest and focus the analysis on this variable. The dataset comprises 5 features
(variables), which are denoted as X1, X2, X3, X4, X5. The description of the variables is given
below:
X1: Relative compactness in percentage (expressed in decimals) – A measure of building
compactness. A high value means highly compact.
X2: Surface area in square metres
X3: Wall area in square metres
X4: Roof area in square metres
X5: Overall height in metres
Y1: Heating load in kWh/m² per annum
Y2: Cooling load in kWh/m² per annum
Tasks:
1. Understand the data [10 marks]
(i) Download the txt file (ENB18data.txt) from CloudDeakin and save it to your R working directory.
(ii) Assign the data to a matrix, e.g. using
the.data <- as.matrix(read.table("ENB18data.txt"))
(iii) Decide whether you would like to investigate Heating Load (Y1) or Cooling Load
(Y2). This is your variable of interest. Generate a subset of 300 data points, e.g. using:
To investigate Heating Load Y1:
my.data <- the.data[sample(1:768,300),c(1:5,6)]
To investigate Cooling Load Y2:
my.data <- the.data[sample(1:768,300),c(1:5,7)]
(iv) Using scatterplots and histograms, report on the general relationship between each
of the variables X1, X2, X3, X4 and X5 and your variable of interest Y1 (heating load)
or Y2 (cooling load). Include a scatter plot for each of the variables X1, X2, X3, X4, X5
against your variable of interest Y1 or Y2. Include a histogram for X1, X2, …, X5, and Y1 or
Y2. Include 1 or 2 sentences about the relationships and distributions.
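The plots requested in Task 1(iv) could be produced along the following lines. This is only a sketch: the helper name plot_overview and the column names are assumptions made for illustration, and the matrix below is random stand-in data so that the snippet runs without ENB18data.txt. In practice my.data comes from the sampling step in (iii).

```r
# Stand-in for my.data so that the sketch runs without ENB18data.txt.
# In practice my.data comes from the sampling step in Task 1(iii).
set.seed(718)  # arbitrary seed, only to make the stand-in reproducible
my.data <- matrix(runif(300 * 6), nrow = 300, ncol = 6)
colnames(my.data) <- c("X1", "X2", "X3", "X4", "X5", "Y")

# Hypothetical helper: draws one scatter plot per X variable against the
# last column, then one histogram per column. Returns (invisibly) the
# Pearson correlation of each X variable with the last column.
plot_overview <- function(d) {
  y <- ncol(d)                    # variable of interest is the last column
  old <- par(mfrow = c(2, 3))     # 2 x 3 grid of panels
  on.exit(par(old))               # restore the previous layout on exit
  for (j in seq_len(y - 1)) {
    plot(d[, j], d[, y],
         xlab = colnames(d)[j], ylab = colnames(d)[y],
         main = paste(colnames(d)[j], "vs", colnames(d)[y]))
  }
  par(mfrow = c(2, 3))            # start a fresh page for the histograms
  for (j in seq_len(y)) {
    hist(d[, j], xlab = colnames(d)[j],
         main = paste("Histogram of", colnames(d)[j]))
  }
  invisible(drop(cor(d[, -y], d[, y])))
}

correlations <- plot_overview(my.data)
print(round(correlations, 2))
```

The sign of each correlation gives a first indication of whether the variable of interest tends to rise or fall with that input, which is what the 1 or 2 sentences per plot need to describe.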
2. Transform the data [15 marks]
(i) Choose any four of the first five variables X1, X2, X3, X4, X5.
Make appropriate transformations to the variables (including Y1 or Y2) so that the values can be aggregated in order to predict the variable of interest (your selected Heating
Load Y1, or Cooling Load Y2). The transformations should reflect the general relationship
between each of the four variables and the variable of interest. Assign your transformed
data along with your transformed variable of interest to an array (it should have 300 rows
and 5 columns). Save it to a txt file titled "name-transformed.txt" using
write.table(your.data, "name-transformed.txt")
(ii) Briefly explain each transformation for your selected variables and the variable of
interest Y1 or Y2 (1-2 sentences each).
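One possible family of transformations, shown here only as an illustration and not as the required choice, is min-max scaling to the unit interval, optionally flipped for a variable that moves in the opposite direction to the variable of interest. The helper names minmax and flip and all numbers below are made up for this sketch; a power or log transform before scaling may suit a skewed variable better.

```r
# Min-max scaling: maps the smallest value to 0 and the largest to 1.
minmax <- function(x) (x - min(x)) / (max(x) - min(x))

# For a variable that decreases as the variable of interest increases,
# flip the scaled values so that "high input" still means "high output".
flip <- function(x) 1 - x

# Tiny illustration on made-up numbers (not taken from ENB18data.txt).
raw <- c(2, 4, 6)
scaled <- minmax(raw)      # 0.0 0.5 1.0
flipped <- flip(scaled)    # 1.0 0.5 0.0

# The same idea applied column by column to a stand-in 300 x 5 array.
set.seed(718)              # arbitrary seed for the stand-in data
stand.in <- matrix(runif(300 * 5, min = 1, max = 50), nrow = 300)
your.data <- apply(stand.in, 2, minmax)
```

Keep a record of the minimum and maximum used for each column, since the same values are needed later to transform a new input and to convert a prediction back to the original units.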
3. Build models and investigate the importance of each variable. [15 marks]
(i) Download the AggWaFit718.R file (from CloudDeakin) to your working directory and
load it into the R workspace using
source("AggWaFit718.R")
(ii) Use the fitting functions to learn the parameters for
Weighted arithmetic mean (WAM),
Weighted power means (PM) with p = 0.5 and p = 2,
Ordered weighted averaging function (OWA), and
Choquet integral.
(iii) Include two tables in your report – one with the error measures (RMSE, Av.abs error,
Pearson correlation, Spearman correlation) and one summarising the weights/parameters
that were learned for your data.
(iv) Compare and interpret the data in your tables. Be sure to comment on:
(a) How good the model is,
(b) The importance of each of the variables (the four variables that you have selected),
(c) Any interaction between any of those variables (are they complementary or redundant?),
(d) Whether better models favour higher or lower inputs.
(1-2 paragraphs for part (iv).)
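For reference when reading the error table, the four measures named in (iii) can be computed in base R as sketched below. This is independent of whatever AggWaFit718.R reports; the function name error_measures and the sample vectors are assumptions made for illustration.

```r
# Hypothetical helper: the four error measures from Task 3(iii),
# computed from observed and predicted values of the variable of interest.
error_measures <- function(observed, predicted) {
  c(RMSE         = sqrt(mean((observed - predicted)^2)),
    Av.abs.error = mean(abs(observed - predicted)),
    Pearson      = cor(observed, predicted, method = "pearson"),
    Spearman     = cor(observed, predicted, method = "spearman"))
}

# Made-up values: every prediction is off by exactly 0.1 and the
# ordering of the observations is preserved.
observed  <- c(0.1, 0.4, 0.6, 0.9)
predicted <- c(0.2, 0.3, 0.7, 0.8)
measures <- error_measures(observed, predicted)
print(round(measures, 3))
```

Lower RMSE and average absolute error indicate a closer fit, while correlations nearer to 1 indicate that the model reproduces the linear trend (Pearson) or the ranking (Spearman) of the data.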
4. Use your model for prediction. [10 marks]
(i) Using your best-fitting model, predict the Heating Load Y1 or the Cooling Load Y2
for the following input:
X1=0.82, X2=612.5, X3=318.5, X4=147, X5=7.
Give your result and comment on whether you think it is reasonable. (1-2 sentences)
(ii) Comment generally on the ideal conditions (in terms of your 4 variables) under which
a low heating or cooling load will occur. (1-2 sentences)
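When predicting for a new input, the input has to pass through the same transformations as the training data, and the model output has to be converted back to the original units. A minimal sketch for a weighted arithmetic mean follows; the weights, ranges and helper names (wam, scale_new, unscale) are placeholders invented for illustration, not values learned from the dataset, and min-max scaling is assumed as the transformation.

```r
# Weighted arithmetic mean of already-transformed inputs.
wam <- function(x, w) {
  stopifnot(length(x) == length(w), abs(sum(w) - 1) < 1e-8)
  sum(w * x)
}

# Scale a new raw value with the SAME min/max used for the training column.
scale_new <- function(value, lo, hi) (value - lo) / (hi - lo)

# Inverse of the min-max scaling applied to the variable of interest.
unscale <- function(value, lo, hi) value * (hi - lo) + lo

# Placeholder numbers only: replace with your learned weights and the
# ranges recorded when transforming your own data.
w        <- c(0.4, 0.3, 0.2, 0.1)
x.scaled <- c(scale_new(5, lo = 0, hi = 10), 0.2, 0.8, 1.0)
y.scaled <- wam(x.scaled, w)
y.pred   <- unscale(y.scaled, lo = 10, hi = 40)
```

A new input that lies outside the range seen in the training subset scales to a value below 0 or above 1, which is worth noting when judging whether a prediction is reasonable.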
For this part, your submission should include:
1. A report (created in any word processor), covering all of the items above. With plots
and tables it should only be 2 – 3 pages.
2. A data file named "name-transformed.txt" (where 'name' is replaced with your name; you can use your surname or first name, just to help us distinguish them).
3. An R code file (that you have written to produce your results) named "name-code.R", where
'name' is your name.
Part B: Optimisation
1. A food factory is making a special Juice for a customer by mixing two different existing
products, JA and JB. The compositions and prices ($/l) of JA and JB are given as follows.
Amount (l) per 100 l of JA and JB
      Carrot   Orange   Apple   Cost ($/l)
JA    4        6        3       6
JB    8        3        6       5
The customer requires at least 3.5 litres of Orange concentrate and at least 4 litres
of Apple concentrate per 100 litres of the Juice, but no more than 6 litres of
Carrot concentrate per 100 litres of Juice. The customer needs at least 50 litres of Juice
per week.
a) Formulate a Linear Programming (LP) model for the factory that minimises the total
cost of producing the Juice while satisfying all constraints.
b) Use the graphical method to find the optimal solution. Show the feasible region and
the optimal solution on the graph. Annotate all lines on your graph. What is the minimal cost for the product?
[25 marks]
2. A factory makes three products (fabrics), Summer, Autumn and Winter, from three
materials: Cotton, Wool and Viscose. The following table provides details
on the sales price and production cost per ton of each product and the purchase price per ton
of each material.
Product   Sales price   Production cost   Material   Purchase price
Summer    $50           $4                Cotton     $30
Autumn    $55           $4                Wool       $45
Winter    $60           $5                Viscose    $40
The maximal demand (in tons) for each product and the minimum cotton and wool proportions in each product are as follows.
Product   Demand (tons)   Min. cotton proportion   Min. wool proportion
Summer    4500            60%                      30%
Autumn    4000            60%                      30%
Winter    3800            40%                      50%
Formulate an LP model for the factory that maximises the profit, while satisfying the
demand and the cotton and wool proportion constraints.
Solve the model using IBM ILOG CPLEX. What are the optimal profit and optimal
values of the decision variables?
Hints:
1. Let $x_{ij} \ge 0$ be a decision variable that denotes the number of tons of product $j$,
for $j \in \{1 = \text{Summer},\ 2 = \text{Autumn},\ 3 = \text{Winter}\}$, to be produced from material
$i \in \{C = \text{Cotton},\ W = \text{Wool},\ V = \text{Viscose}\}$.
2. The proportion of a particular type of material in a particular type of product can be
calculated from these variables; e.g., the proportion of Cotton in product Summer is given by:
$$\frac{x_{C,1}}{x_{C,1} + x_{W,1} + x_{V,1}}.$$
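A ratio such as the one in Hint 2 is not linear in the decision variables as written. One standard step, sketched here under the assumption that the total amount of the product is positive, is to multiply through by the denominator. The symbol $\alpha$ is introduced only for this illustration and stands for a required minimum proportion.

```latex
\frac{x_{C,1}}{x_{C,1} + x_{W,1} + x_{V,1}} \ge \alpha
\quad\Longleftrightarrow\quad
x_{C,1} \ge \alpha \,\bigl(x_{C,1} + x_{W,1} + x_{V,1}\bigr)
\quad\Longleftrightarrow\quad
(1 - \alpha)\, x_{C,1} - \alpha\, x_{W,1} - \alpha\, x_{V,1} \ge 0 .
```

The last form has constant coefficients on each variable, so it can be entered directly as a linear constraint.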
[25 marks]
Submission
Submit to the SIT718 CloudDeakin Dropbox.
Combine the report from Part A and the solutions from Part B in ONE pdf file. Copy and
paste your CPLEX code into the solutions for Part B. Label the file name.pdf, where 'name' is
replaced with your name (you can use your surname or first name, to help distinguish them).
Your final submission should consist of no more than 4 files:
1. One pdf file (created in any word processor), containing the report of Part A and the solutions
of the two questions of Part B, including CPLEX code, labelled with your name. This
file should be no more than 5-6 pages.
2. A data file named "name-transformed.txt" (where 'name' is replaced with your name).
3. Your R code file, labelled name.R.
4. Your CPLEX code file, labelled name.mod; also copy the code into your
solution document.