Zoos with Giant Pandas in their exhibitions, as of mid-March 2019.

And Copenhagen Zoo, which will get their pandas in April.

One of our users needs to find the maximum value of a variable. He also needs to find the corresponding value in another variable.

As in: the maximum value in column A is in row 42. What is the value in column B, row 42?

And of course we need to do it for several groups.

Let us begin by making a dataset. Three groups in id:

```
library(tidyverse)

id <- 1:3
val <- c(10, 20)
kor <- c("a", "b", "c")

example <- expand.grid(id, val) %>%
  as_tibble() %>%
  arrange(Var1) %>%
  cbind(kor, stringsAsFactors = FALSE) %>%
  rename(group = Var1, value = Var2, corr = kor)
example
```

```
##   group value corr
## 1     1    10    a
## 2     1    20    b
## 3     2    10    c
## 4     2    20    a
## 5     3    10    b
## 6     3    20    c
```

We have six observations, divided into three groups. They all have a value, and a letter in “corr” that is the corresponding value we are interested in.

So. In group 1 we should find the maximum value 20, and the corresponding value “b”.

In group 2 the max value is still 20, but the corresponding value we are looking for is “a”.

And in group 3 the max value is yet again 20, but the corresponding value is now “c”.

How to do that?

```
example %>%
  group_by(group) %>%
  mutate(max = max(value)) %>%
  mutate(max_corr = corr[value == max]) %>%
  ungroup()
```

```
## # A tibble: 6 x 5
##   group value corr    max max_corr
##   <int> <dbl> <chr> <dbl> <chr>
## 1     1   10. a       20. b
## 2     1   20. b       20. b
## 3     2   10. c       20. a
## 4     2   20. a       20. a
## 5     3   10. b       20. c
## 6     3   20. c       20. c
```

The maximum value for all groups is 20. And the corresponding values in the groups are b, a and c, respectively.

Isn't there an easier solution using the summarise() function? Probably. But our user needs to do this for a lot of variables. And their names have nothing in common.
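If the variable names were fixed, a summarise()-based version could look something like this sketch (my own variant, not the user's code; which.max() gives the row index of the largest value within each group):

```r
library(dplyr)

# Rebuild the example data from above (three groups, two values each)
example <- data.frame(
  group = rep(1:3, each = 2),
  value = rep(c(10, 20), 3),
  corr  = c("a", "b", "c", "a", "b", "c"),
  stringsAsFactors = FALSE
)

# One row per group: the maximum value and its corresponding letter
res <- example %>%
  group_by(group) %>%
  summarise(max = max(value), max_corr = corr[which.max(value)])
res
```

Note that which.max() returns only the first maximum if there are ties, whereas corr[value == max] in the mutate() version returns all of them.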

One can only hope that the concept “Digital Natives” will soon be laid to rest. Or at least all the ideas about what they can do.

A digital native is a person who grew up in the digital age, in contrast to digital immigrants, who gained their familiarity with digital systems as adults.

And there are differences. Digital natives assume that everything is online. Stuff that is not online does not exist. Their first instinct is digital.

However, in the library world, and a lot of other places, the idea has been that digital natives, because they have never experienced a world without computers, grok them. That they just know how to use them, and how to use them in a responsible and effective way.

That is, with a technical term, bovine feces. And for far too long, libraries (and others) have ignored the real needs, assuming that there was now suddenly no need for instruction in IT-related issues. Because digital natives.

Being a digital native does not mean that you know how to code.

Being a digital native does not mean that you know how to google efficiently.

Being a digital native does not mean that you are magically endowed with the ability to discern fake news from facts.

I myself am a car native. I have grown up in an age where cars were ubiquitous. And I still had to take the test twice before getting my license. I was not able to drive a car safely just because I had never known a world without cars. Why do we assume that a digital native should be able to use a computer efficiently?

For many years, from 1977 to 2006, there was a regular feature in the journal for the Danish Chemical Society. “Kemiske småforsøg”, or “Small chemical experiments”. It was edited by the founder of the Danish Society for Historical Chemistry, and contained a lot of interesting chemistry, some of it with a historical angle.

The Danish Society for Historical Chemistry is considering collecting these experiments, and publishing them. It has been done before, but more experiments were published after that.

We still don’t know if we will be allowed to do it. And it is a pretty daunting task, as there are several hundred experiments. But that is what I’m spending my free time on at the moment. If we get it published, it will be for sale on the website of the Danish Society for Historical Chemistry.

We’re looking at Pythagorean triplets, that is, triplets where a, b and c are integers, and:

a^{2} + b^{2} = c^{2}

The triangle defined by a,b,c has a perimeter.

The triplet 20, 48, 52 fulfills the equation: 20^{2} + 48^{2} = 52^{2}. And the perimeter of the triangle is 20 + 48 + 52 = 120.

Which perimeter p, smaller than 1000, has the most solutions?

So, we have two equations:

a^{2} + b^{2} = c^{2}

p = a + b + c

We can write

c = p – a – b

And substitute that into the first equation:

a^{2} + b^{2} = (p – a – b)^{2}

Expanding the parenthesis:

a^{2} + b^{2} = p^{2} + a^{2} + b^{2} – 2ap – 2bp + 2ab

Cancelling:

0 = p^{2} – 2ap – 2bp + 2ab

Isolating b:

0 = p^{2} – 2ap – b(2p – 2a)

b(2p – 2a) = p^{2} – 2ap

b = (p^{2} – 2ap)/(2p – 2a)

So. For a given value of p, we can run through all possible values of a and get b. If b is an integer, we have a solution that satisfies the constraints.

The smallest value of a we need to check is 1. But what is the largest value of a for a given value of p?

We can see from the Pythagorean equation that a ≤ b < c. (a might be larger than b, but we can then just switch a and b, so the assumption holds.) It follows that a ≤ p/3: if a were larger than p/3, then b and c would be as well, and a + b + c would be larger than p.

What else? If a and b are both even, a^{2} and b^{2} are also even, then c^{2} is even, and then c is even, and therefore p = a + b + c is also even.

If a and b are both odd, a^{2} and b^{2} are also odd, and c^{2} is then even. c is then even. And therefore p = a + b + c must be even.

If exactly one of a and b is odd, exactly one of a^{2} and b^{2} is odd. Then c^{2} is odd, and c is then odd. Therefore p = a + b + c must be even.

So. I only need to check even values of p. That halves the number of values to check.
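The parity argument can be sanity-checked by brute force. This is just a sketch of my own, not part of the solution: enumerate all Pythagorean triplets with legs up to 50 and confirm that every perimeter is even.

```r
# Collect the perimeters of all Pythagorean triplets with a <= b <= 50.
perimeters <- c()
for (a in 1:50) {
  for (b in a:50) {
    c2 <- a^2 + b^2
    cc <- floor(sqrt(c2) + 0.5)  # nearest integer candidate for c
    if (cc * cc == c2) {
      perimeters <- c(perimeters, a + b + cc)
    }
  }
}

# The smallest triplet, 3-4-5, has perimeter 12; every perimeter is even.
all(perimeters %% 2 == 0)
```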

Alright, time to write some code:

```
current_best_number_of_solutions <- 0

for (p in seq(2, 1000, by = 2)) {
  solutions_for_current_p <- 0
  for (a in 1:ceiling(p / 3)) {
    if (!(p**2 - 2 * a * p) %% (2 * p - 2 * a)) {
      solutions_for_current_p <- solutions_for_current_p + 1
    }
  }
  if (solutions_for_current_p > current_best_number_of_solutions) {
    current_best_p <- p
    current_best_number_of_solutions <- solutions_for_current_p
  }
}
answer <- current_best_p
```

current_best_number_of_solutions is initialized to 0.

For every p from 2 to 1000, in steps of 2 (only checking even values of p), I set the number of solutions_for_current_p to 0.

For every value of a from 1 to p/3, rounded up to an integer: if !(p**2-2*a*p)%%(2*p-2*a) is true, that is, if the remainder of (p**2-2*a*p)/(2*p-2*a) is 0, I increment solutions_for_current_p.

After running through all possible values of a for the value of p we have reached in the for-loop:

If the number of solutions for this value of p is larger than the previous current_best_number_of_solutions, we have found a value of p with more solutions than any value of p we have examined so far. In that case, set current_best_p to the current value of p, and current_best_number_of_solutions to the number of solutions we found for that p.

If not, don't change anything; just reset solutions_for_current_p and check a new value of p.
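A more compact, vectorised variant of the same search could look like this (my own sketch, not the original code; it additionally requires b >= a, so each triplet is only counted once):

```r
# For a given perimeter p, count the values of a that give an integer b
# with b >= a, using b = (p^2 - 2ap) / (2p - 2a) from the derivation above.
count_solutions <- function(p) {
  a <- seq_len(p %/% 3)
  b <- (p^2 - 2 * a * p) / (2 * p - 2 * a)
  sum(b == floor(b) & b >= a)
}

p_values <- seq(2, 1000, by = 2)  # only even perimeters can have solutions
counts <- sapply(p_values, count_solutions)
best_p <- p_values[which.max(counts)]
best_p
```

As a check, p = 120 from the example above should yield exactly three solutions: (20, 48, 52), (30, 40, 50) and (24, 45, 51).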

A palindromic number is similar to a palindrome: it reads the same from left to right and from right to left.

Project Euler tells us that the largest palindrome made from the product of two 2-digit numbers is 9009. That number is made by multiplying 91 and 99.

I must now find the largest palindrome made from the product of two 3-digit numbers.

One thing is given: the three-digit numbers cannot end with a zero. If one did, the product would end with a zero, and a palindrome cannot both end and start with zero.

There are probably other restrictions as well.

I’ll need a function that tests if a given number is palindromic.

```
library(stringr)

palindromic <- function(x) {
  sapply(x, function(x) {
    str_c(rev(unlist(str_split(as.character(x), ""))), collapse = "") == as.character(x)
  })
}
```

The function part converts x to a character, splits it into individual characters, unlists the result, reverses it, and concatenates it back into a string. That string is then compared to the original x, converted to a character.

The sapply part kinda vectorises it. But it is still the slow part.
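An arithmetic alternative might avoid the string handling altogether (a sketch of my own, under the assumption that digit arithmetic is cheaper than string splitting): reverse the digits numerically and compare.

```r
# Reverse the digits of each number arithmetically; a number is
# palindromic if it equals its own digit reversal.
palindromic_num <- function(x) {
  sapply(x, function(n) {
    reversed <- 0
    m <- n
    while (m > 0) {
      reversed <- reversed * 10 + m %% 10  # append the last digit of m
      m <- m %/% 10                        # drop the last digit of m
    }
    reversed == n
  })
}

palindromic_num(c(9009, 9010))
```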

If I could pare down the number of candidates to check, that would be nice.

One way would be to compare the first and last digits in the number.

```
first_last <- function(x) {
  x %/% 10^(floor(log10(x))) == x %% 10
}
```

This function finds the number of digits in x minus one, via floor(log10(x)). I then integer-divide x by 10 raised to that power, which gives me the first digit, and compare it with the last digit, x %% 10. If the first and the last digit are the same, it returns TRUE.
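A quick check of first_last() (the function is repeated here so the snippet is self-contained):

```r
first_last <- function(x) {
  x %/% 10^(floor(log10(x))) == x %% 10
}

first_last(121)   # first digit 1, last digit 1: TRUE
first_last(123)   # first digit 1, last digit 3: FALSE
first_last(9009)  # first digit 9, last digit 9: TRUE
```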

Now I am ready. Generate a vector of all three-digit numbers from 101 to 999. Expand the grid to get all combinations and convert to a tibble. Filter out all the combinations where a number ends with 0, calculate a new column as the product of the two numbers, filter out the results where the first and last digit are not identical, and then filter out the results that are not palindromic. Finally, pass it to max() (using %$% to access the individual variables), and get the result.

```
library(dplyr)
library(magrittr)

res <- 101:999 %>%
  expand.grid(., .) %>%
  as_tibble() %>%
  filter(Var1 %% 10 != 0, Var2 %% 10 != 0) %>%
  mutate(pal = Var1 * Var2) %>%
  filter(first_last(pal)) %>%
  filter(palindromic(pal)) %$%
  max(pal)
```

There are probably faster ways of doing this…
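One faster approach might be a plain double loop that runs from the top and prunes early (my own sketch, not part of the original solution):

```r
# Check whether a number reads the same forwards and backwards.
is_pal <- function(n) {
  s <- as.character(n)
  s == paste(rev(strsplit(s, "")[[1]]), collapse = "")
}

best <- 0
for (i in 999:100) {
  if (i * 999 < best) break      # no remaining i can beat the best found
  for (j in 999:i) {
    product <- i * j
    if (product > best && is_pal(product)) {
      best <- product
    }
  }
}
best
```

The product > best guard means the relatively slow palindrome test only runs for candidates that could actually improve the answer.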

I have a love-hate relationship with visions and missions. The ones that companies and organizations spend a lot of time and resources on developing. They often follow a rather formulaic pattern:

We exist to restore intellectual capital whilst continuing to collaboratively coordinate information.

This is actually a mission statement from a mission statement generator.

I do love the idea of missions, visions and strategies. It speaks to the engineer in me. We should define the goal we want to achieve, and then break that goal down into individual work-packages. When we have completed all of them, we have achieved our goal. The logical framework approach is a good example.

I also like values. I need consistency. Or, I don’t need it, but it is important to me. I like people and organizations to actually be consistent in their actions. Walk the talk! If you want a work environment that does not discriminate, then do not discriminate. Against anyone. In any way. Actually, I do not really mind if you discriminate based on gender. Go ahead. Just be honest about it. In reality I will of course hate you if you discriminate based on gender. But the hate will take on a deeper and more incandescent nature if you discriminate based on gender while claiming that you are all about equality.

So – I love strategies. I love visions. I love missions. I love values.

On the other hand, I hate them. Most of them are exactly like the example above, which is taken from a mission statement generator. No one is able to explain the difference between mission and vision, and the values all fail the negation test, and tend to end up rather tautological. Often they are self-contradictory. You can have loyalty in all situations. Or you can have honesty in all situations. But you can’t have both; sometimes being loyal means holding back on honesty a bit. How do you prioritize your values? I have never seen a description of a hierarchy.

Usually it doesn’t matter. Most values in organizations tend to be nothing more than hot air. And as hot air, easy to dismiss when it is opportune. Try it yourself. Tell your boss that her decision is wrong, because it is in conflict with the defined values of the organization.

So – I love values. But not when they are meaningless.

And if they are – don’t bother defining them. They are just going to be a waste of time.

Not that advanced, but I wanted to play around a bit with plotting the raw data from OpenStreetMap.

We’re going to Florence this fall. It’s been five years since we last visited the fair city, which has played such an important role in Western history.

OpenStreetMap is, as the name implies, open.

I’m going to need some libraries:

```
#library(OpenStreetMap)
library(osmar)
library(ggplot2)
library(broom)
library(geosphere)
library(dplyr)
```

osmar provides functions to interact with OpenStreetMap. ggplot2 is used for the plots, broom for making some objects tidy, and dplyr for manipulating data. geosphere is loaded as well, although it is not used in what follows.

Getting the raw data requires me to define a bounding box encompassing the part of Florence I would like to work with. Looking at https://www.openstreetmap.org/export#map=13/43.7715/11.2717, I choose these coordinates:

```
top <- 43.7770
bottom <- 43.7642
left <- 11.2443
right <- 11.2661
```

After that, I can define the bounding box and tell the osmar functions at what URL the relevant API can be found (this is just the default). Then I can retrieve the data via get_osm(). I immediately save it to disk; the download takes some time, and there is no reason to do it more than once.

```
box <- corner_bbox(left, bottom, right, top)
src <- osmsource_api(url = "https://api.openstreetmap.org/api/0.6/")
florence <- get_osm(box, source=src)
saveRDS(florence, "florence.rda")
```

Let’s begin by making a quick plot:

```
plot(florence, xlim = c(left, right), ylim = c(bottom, top))
```

Note that what we get is a plot of, among other things, all lines that are at least partly inside the box. If a line extends beyond the box, we get the whole line as well.

Looking at the data:

```
summary(florence$ways)
```

```
## osmar$ways object
## 6707 ways, 9689 tags, 59052 refs
##
## ..$attrs data.frame:
##     id, visible, timestamp, version, changeset, user, uid
## ..$tags data.frame:
##     id, k, v
## ..$refs data.frame:
##     id, ref
##
## Key-Value contingency table:
##         Key         Value Freq
## 1  building           yes 4157
## 2    oneway           yes  456
## 3   highway    pedestrian  335
## 4   highway   residential  317
## 5   bicycle           yes  316
## 6       psv           yes  122
## 7   highway  unclassified  108
## 8   highway       footway  101
## 9   barrier          wall   98
## 10  surface paving_stones   87
```

I would like to plot the roads and buildings. There are a lot of highways, of a kind I would probably not call highways; in OpenStreetMap, highway is the generic key for any kind of road, street or path.

Anyway, let’s make a list of tags. tags() finds the elements that have a key in tag_list, way() finds the ways that are represented by these elements, and find() finds the IDs of the objects in “florence” matching this.

find_down() finds all the elements related to these IDs. And finally we take the subset of the large florence dataset which has IDs matching the IDs we found before.

```
tag_list <- c("highway", "bicycle", "oneway", "building")
dat <- find(florence, way(tags(k %in% tag_list)))
dat <- find_down(florence, way(dat))
dat <- subset(florence, ids = dat)
```

Now, in a couple of lines, I’m gonna tidy the data. That removes the information about the type of line. As I would like to be able to color highways differently from buildings, I need to keep that information.

Saving the key-part of the tags, and the id:

```
types <- data.frame(dat$ways$tags$k, dat$ways$tags$id)
names(types) <- c("type", "id")
```

This gives me all the key-parts of all the tags. And I’m only interested in a subset of them:

```
types <- types %>%
  filter(type %in% tag_list)
types$id <- as.character(types$id)
```

Next, as_sp() converts the osmar object to a spatial object (taking just the lines):

```
dat <- as_sp(dat, "lines")
```

tidy() (from the broom library) converts it to a tidy tibble:

```
dat <- tidy(dat)
```

That tibble is missing the types, so those are added:

```
new_df <- left_join(dat, types, by="id")
```

And now we can plot:

```
new_df %>%
  ggplot(aes(x = long, y = lat, group = group)) +
  geom_path(aes(color = type)) +
  scale_color_brewer() +
  xlim(left, right) +
  ylim(bottom, top) +
  theme_void() +
  theme(legend.position = "none")
```

Nice.

What’s next? Something like what is on this page: https://github.com/ropensci/osmplotr

This probably sounds like humble bragging. But I have recently, again, realized that my biggest weakness is that I take responsibility.

Hey! How is that a weakness?

Well… It becomes a weakness when you continually take responsibility for stuff that is really not your responsibility. To the extent that you get stress, hypertension and ulcers. And to the extent that it impacts negatively on the things that actually are your responsibility.

And I have just done it again. The ad for a meeting in the local party is not very readable. That is not my responsibility. It belongs to the chairman. Not me. I should simply notify him that it is not very readable, and trust that he will do something about it. Instead I am thinking about remaking it myself. It would not be very difficult. But I do get stressed. If I have to redesign the ad, I won’t have time to cook dinner tonight. Or clean the house.

This is something that I really have to get better at handling. Otherwise I’ll be a very responsible person, doing great things for people and organizations around me. While burning out very fast.