In order to do this, C4. A classifier is a tool in data mining that takes a bunch of data representing things we want to classify and attempts to predict which class the new data belongs to.

Sure, suppose a dataset contains a bunch of patients. We know various things about each patient like age, pulse, blood pressure, VO2 max, family history, etc.

These are called attributes. Given these attributes, we want to predict whether the patient will get cancer. The patient can fall into 1 of 2 classes: will get cancer or won't get cancer.

Decision tree learning creates something similar to a flowchart to classify new data.

Using the same patient example, one particular path in the flowchart could be:. At each point in the flowchart is a question about the value of some attribute, and depending on those values, he or she gets classified.

You can find lots of examples of decision trees. Is this supervised or unsupervised? This is supervised learning, since the training dataset is labeled with classes.

Using the patient example, C4. We told it first, it generated a decision tree, and now it uses the decision tree to classify. Arguably, the best selling point of decision trees is their ease of interpretation and explanation.
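The train-then-classify flow can be sketched in a few lines of Python. This is a simplified information-gain tree on categorical attributes, closer to ID3 than to full C4.5 (which also uses gain ratio, handles continuous attributes, and prunes). The patient records, attribute values, and labels below are invented purely for illustration.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def best_attribute(rows, labels, attributes):
    """Pick the attribute whose split gives the largest information gain."""
    base = entropy(labels)

    def gain(attr):
        remainder = 0.0
        for value in set(row[attr] for row in rows):
            subset = [lab for row, lab in zip(rows, labels) if row[attr] == value]
            remainder += len(subset) / len(labels) * entropy(subset)
        return base - remainder

    return max(attributes, key=gain)

def build_tree(rows, labels, attributes):
    """Return a class label (leaf) or an (attribute, branches, majority) node."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attributes:
        return majority
    attr = best_attribute(rows, labels, attributes)
    remaining = [a for a in attributes if a != attr]
    branches = {}
    for value in set(row[attr] for row in rows):
        keep = [i for i, row in enumerate(rows) if row[attr] == value]
        branches[value] = build_tree(
            [rows[i] for i in keep], [labels[i] for i in keep], remaining
        )
    return (attr, branches, majority)

def classify(tree, row):
    """Follow the flowchart: answer one attribute question per node."""
    while isinstance(tree, tuple):
        attr, branches, majority = tree
        # Unseen attribute values fall back to the node's majority class.
        tree = branches.get(row[attr], majority)
    return tree

# Toy, made-up training data: labeled patients (this is the supervised part).
patients = [
    {"family_history": "yes", "blood_pressure": "high"},
    {"family_history": "yes", "blood_pressure": "normal"},
    {"family_history": "no", "blood_pressure": "high"},
    {"family_history": "no", "blood_pressure": "normal"},
]
outcomes = ["cancer", "cancer", "no cancer", "no cancer"]

tree = build_tree(patients, outcomes, ["family_history", "blood_pressure"])
prediction = classify(tree, {"family_history": "yes", "blood_pressure": "normal"})
```

The labels are supplied up front, the tree is generated from them, and only then is the tree used to classify a new patient.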

They are also quite fast, quite popular and the output is human readable. Where is it used? A popular open-source Java implementation can be found over at OpenTox.

Orange, an open-source data visualization and analysis tool for data mining, implements C4.5.

Cluster analysis is a family of algorithms designed to form groups such that the group members are more similar to each other than to non-group members.

Clusters and groups are synonymous in the world of cluster analysis. Is there an example of this? Definitely, suppose we have a dataset of patients.

In cluster analysis, these would be called observations. We know various things about each patient like age, pulse, blood pressure, VO2 max, cholesterol, etc.

This is a vector representing the patient. You can basically think of a vector as a list of numbers we know about the patient.

This list can also be interpreted as coordinates in multi-dimensional space. Pulse can be one dimension, blood pressure another dimension and so forth.

Given this set of vectors, how do we cluster together patients that have similar age, pulse, blood pressure, etc?

How does k-means take care of the rest? It depends, but most would classify k-means as unsupervised. The key selling point of k-means is its simplicity.
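To give a feel for that simplicity, here is a minimal sketch of the standard k-means loop (often called Lloyd's algorithm): assign each vector to its nearest centroid, move each centroid to the mean of its members, and repeat until nothing changes. The patient vectors (age, pulse), the choice of k = 2, and the fixed random seed are assumptions made for illustration.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of vectors."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def k_means(points, k, iterations=100, seed=0):
    rng = random.Random(seed)
    # Initial centroids are k randomly chosen observations.
    centroids = rng.sample(points, k)
    assignment = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = mean(members)
    return centroids, assignment

# Made-up patient vectors: (age, pulse).
patients = [(25, 60), (27, 62), (26, 58), (60, 85), (62, 88), (61, 90)]
centroids, assignment = k_means(patients, k=2)
```

No labels are passed in anywhere, which is why most would call this unsupervised.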

Two key weaknesses of k-means are its sensitivity to outliers, and its sensitivity to the initial choice of centroids. A ton of implementations for k-means clustering are available online.

A support vector machine (SVM) learns a hyperplane to classify data into 2 classes. At a high level, SVM performs a task similar to C4.5.

In fact, for a simple classification task with just 2 features, the hyperplane can be a line. SVM can perform a trick to project your data into higher dimensions.

Once projected into higher dimensions…. Do you have an example? Absolutely, the simplest example I found starts with a bunch of red and blue balls on a table.

When a new ball is added on the table, by knowing which side of the stick the ball is on, you can predict its color. What do the balls, table and stick represent?
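One way to read the analogy in code: each ball is a point with 2 features, the table is the plane those features live in, and the stick is a line w·x + b = 0. The sketch below only shows the classification step with a hand-picked stick; a real SVM would learn w and b from the balls, choosing the stick with the widest margin. The coordinates, w, and b are all made up for illustration.

```python
from math import hypot

# Hypothetical stick: the line w[0]*x + w[1]*y + b = 0 on the table.
w = (1.0, -1.0)
b = 0.0

def side(point):
    """Signed score: positive on one side of the stick, negative on the other."""
    return w[0] * point[0] + w[1] * point[1] + b

def predict_color(point):
    """Predict a ball's color from the side of the stick it lands on."""
    return "red" if side(point) > 0 else "blue"

def margin(points):
    """Smallest distance from any ball to the stick."""
    return min(abs(side(p)) for p in points) / hypot(*w)

# Made-up ball positions on the table.
red_balls = [(3.0, 1.0), (4.0, 2.0), (5.0, 1.5)]
blue_balls = [(1.0, 3.0), (2.0, 4.0), (1.5, 5.0)]

new_ball = (4.0, 1.0)
color = predict_color(new_ball)
gap = margin(red_balls + blue_balls)
```

With more than 2 features the same rule applies unchanged; the line simply becomes a hyperplane.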

However, there are other more exotic examples of totally ordered sets. For instance, a classroom X of children is totally ordered by height provided that no two children in X have identical height.

However, not all orders are total. A poset (short for partially ordered set) is a set with a partial order. For another example, let X be the following set of movies: The Star Wars Holiday Special is often considered one of the worst movies of all time, whereas The Godfather is considered one of the best.

However, in my opinion, The Matrix and Shrek are incomparable.

Recall that a DAG, or a directed acyclic graph, is a collection of vertices connected by arrows with no cycles.

First, we can partially order the vertex set of any DAG D. For each element x in X, we draw an arrow to every y in X which covers x. In the finite case, these operations are mutual inverses, yielding a one-to-one correspondence between finite DAGs and finite posets.
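The two operations can be sketched in Python. The sketch assumes the order induced by a DAG is reachability (x ≤ y when a directed path leads from x to y), and it follows the arrow convention above: an arrow from x to every y that covers x. The four-vertex graph and the function names are hypothetical.

```python
def order_from_dag(vertices, arrows):
    """Partial order from a DAG: x <= y iff a directed path leads from x to y."""
    leq = {(v, v) for v in vertices} | set(arrows)
    # Warshall-style transitive closure.
    for k in vertices:
        for i in vertices:
            for j in vertices:
                if (i, k) in leq and (k, j) in leq:
                    leq.add((i, j))
    return leq

def covers(vertices, leq):
    """Arrows x -> y where y covers x: x < y with nothing strictly in between."""
    return {
        (x, y)
        for x in vertices
        for y in vertices
        if x != y
        and (x, y) in leq
        and not any(
            z not in (x, y) and (x, z) in leq and (z, y) in leq for z in vertices
        )
    }

# Hypothetical DAG: a diamond plus one redundant shortcut a -> d.
vertices = ["a", "b", "c", "d"]
arrows = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("a", "d")}

leq = order_from_dag(vertices, arrows)      # DAG -> poset
hasse = covers(vertices, leq)               # poset -> DAG
round_trip = order_from_dag(vertices, hasse)
```

Here b and c are incomparable, and going poset to DAG to poset returns the same order. Going the other way drops the shortcut a -> d, so in this sketch the round trip is exact only for DAGs without such redundant arrows.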

Thus every finite DAG is essentially a finite poset, and vice versa, allowing us to freely confuse the concepts. The IOTA protocol gives priority to larger transactions: For instance, the genesis is both the largest or maximum element in T and also the most important of all transactions.

However, we still have a problem: In this case, we need to decide which transaction is more important. IOTA solves this problem by using the confidence level to totally order T: For example, if transactions x and y spend the same money from the same account, then they conflict.

We make one caveat: Partial orders also appear in traditional blockchains. In a blockchain, transactions are ordered by where they appear on the longest chain.

However, since any transaction not on the main chain is ignored, it is placed in a negatively infinite position. If two transactions potentially conflict, then a DLT protocol needs to prioritize them in some fashion.
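As a rough sketch of that ordering, the function below maps a transaction to its position on the longest chain and maps anything off the chain to negative infinity. The chain layout (a list of blocks, each a list of transaction IDs) and the names are assumptions for illustration, not any particular protocol's data model. How a position translates into priority is left open here.

```python
from math import inf

def position(tx, longest_chain):
    """(block height, index in block) of tx, or (-inf, -inf) if it is off-chain."""
    for height, block in enumerate(longest_chain):
        if tx in block:
            return (height, block.index(tx))
    return (-inf, -inf)

# Hypothetical longest chain: two blocks holding three transactions.
longest_chain = [["tx_a", "tx_b"], ["tx_c"]]

on_chain = sorted(["tx_c", "tx_a", "tx_b"], key=lambda t: position(t, longest_chain))
ignored = position("tx_z", longest_chain)  # tx_z never made it onto the chain
```

Python compares the tuples lexicographically, so every ignored transaction sorts below every transaction on the chain.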

However, priority is a partial order on the set of transactions. Thus all nodes in a DLT must reach consensus on the partial order of transactions.

This story is similar to the equivalence of classical consensus to atomic broadcast. A protocol in a distributed system achieves atomic broadcast if it establishes amongst the nodes an ordered log of messages.

Computer scientists have shown that consensus and atomic broadcast are equivalent problems. Establishing a partial order on transactions appears to be the DLT equivalent of atomic broadcast.

So how do posets help us? First, posets are rich and well-studied mathematical objects. Thus we have an opportunity to import language, algorithms, and theorems from the world of posets to the world of DLTs.

Considering a problem from multiple perspectives often reveals innovative solutions. Posets also help us understand how a specific DLT reaches consensus by analyzing how it orders transactions.

This analysis can highlight security weaknesses:

Posets and Consensus was originally published in IOTA on Medium.

