The dataset used is the Wisconsin breast cancer data set from the scikit-learn sample data collection :
- Classes: 2
 - Samples per class- 212(M),357(B)
 - Samples total- 569
 - Dimensionality- 30
 - Features- real, positive
 
Description of each of the 10 parameters of the cell nuclei :
- radius : radius of an individual nucleus. It is measured by averaging length of the radial line segments
 - texture : measured by finding the variance of grey scale intensities in the component pixels
 - perimeter : total distance between the snake points
 - area : measured by counting the number of pixels on the interior of the snake and adding one half of pixels in the perimeter
 - smoothness : calculated by measuring the difference between length of a radial line and mean length of lines surrounding it
 - compactness : given by combining perimeter and area of cell nuclei by using formula (perimeter)^2/area
 - concavity : measure of number and severity of concavities or indentations in a cell nucleus.
 - concave points : measures the number rather than magnitude of contour concavities
 - symmetry : measured by calculating length differences between lines perpendicular to major axis or longest hord through center to the cell boundary in both directions
 - fractal dimension: is calculated using the coastline approximation by calculating “coastline approximation” - 1
 
Below are the list of works that have been as a part fo this project :
- Applied two classifier models, Decision Trees and Support Vector Machines to classify breast cancer from a set of characteristics of the cell nuclei in an image of a fine needle aspirate of a breast mass.
 - Compared the two different classifiers and used hyper parameter optimisation and scatter plots for observation.