My biggest weakness?

This probably sounds like humble bragging. But I have recently – again – realized that my biggest weakness is that I take responsibility.

Hey! How is that a weakness?

Well… It becomes a weakness when you continually take responsibility for stuff that is really not your responsibility. To the extent that you get stress, hypertension and ulcers. And to the extent that it impacts negatively on the things that actually are your responsibility.

And I have just done it again. The ad for a meeting in the local party is not very readable. That is not my responsibility. It belongs to the chairman. Not me. I should simply notify him that it is not very readable. And trust that he will do something about it. Instead I am thinking about remaking it myself. It would not be very difficult. But I do get stressed. If I have to redesign the ad, I won’t have time to cook dinner tonight. And clean the house.

This is something that I really have to get better at handling. Otherwise I’ll be a very responsible person, doing great things for people and organizations around me. While burning out very fast.

Yet another idea for a project

The Library of Congress has dumped a lot (like a LOT) of data on the net:

https://catalog.data.gov/dataset?q=organization:library-of-congress+AND+type:dataset&publisher=Library+of+Congress

Including their catalogue. I was wondering… Would it be possible to build a machine learning algorithm that returns a subject, based on the title? And maybe other information?

Something to look into during the long hours in the summer, when the boss is on holiday, the patrons are away, and we have time to do interesting stuff? Not that we are not doing interesting stuff already, but stuff that is interesting in itself.

You have to run very fast to stay in the same place

In a world, changing with increasing speed, it is simply human nature to try to make things stay the way they have always been. We don’t like change. And we are prepared to do a lot of work to prevent the change. Even if it is inevitable.

Because of this, forward thinking managers, not only in libraries, are spending a lot of time changing things. That is usually a good thing.

But sometimes it is not. The idea that we should change, adapt, develop and reform things, because if we don’t the world around us will change, can often lead to spending a lot of time running in circles. And not necessarily forward. We are not always certain that a change is actually for the better, but at least it is a change, and we have to change, don’t we?

So – one of the cynical lessons from someone who has spent the last 20+ years observing projects, is that far too often, changes are made based on the assumption that change in itself is good. Or:

  1. To improve things, things must change
  2. We are changing things
  3. Therefore, we are improving things.

Before you spend my tax money on changing things, please pause and consider if the change is actually for the better. Or just a change.

Competence development

What is it?

We spend a lot of time talking about it. And to be honest, in my opinion, not very much time actually doing anything about it.

Competence is the ability to do something successfully or efficiently.

It follows logically that when you develop your competences, you will, afterwards, be able to do something (successfully) that you were not able to do before. Or you will be able to do it more efficiently, i.e. better or faster.

Usually it is not enough to acquire a skill. “Use it or lose it” they say. I was actually speaking pretty good German when I was 15. Today I’m anything but fluent, and struggling with Duolingo to regain the skills I had 28 years ago. I do read technical German pretty well. And that testifies to the fact that I have used the part of my “German competences” that relates to technical texts. Whereas I have not used the part that allows me to speak, or understand spoken, German.

It’s the same with the skills you need at your job. You may acquire some skills at a course, by reading a book, by getting peer-training. But if you don’t use them afterwards, you lose them.

Far too often we spend time attending a course, get back to work, and do not use the things we learned. The net result is that we wasted the time and money spent on the course. No one made sure that we had the time to use our new skills. Or that procedures in the organization were changed accordingly. Maybe we learned skills that we don’t really need.
It would be fun to learn to weld. And even better if my boss would pay for it. But I would never get to use that skill at work. Or at home. So it would be a bloody waste of time! It would probably be bloody in a literal sense as well…

That is one problem. Developing skills and competences that are not actually useful. Or used. I have had it happen to me several times. I’ve even paid for useless courses and activities out of my own pocket.

The other problem is labelling.

We all agree that competence development is positive. We want more of it. It is nice, and necessary. Therefore it is not surprising that we would like to label activities that do not really develop any competences as developing competences. I have been required to register, as competence development, giving an introduction to ebook readers. I agree that the participants in the course, the colleagues subjected to me talking about ebook readers for 1½ hours, had their competences developed. Or at least I hope they had.

But how is giving that introduction developing my competences? As an academic I am able to argue for, or against, anything, sometimes even at the same time. But my competences regarding ebook readers were definitely not developed. I might have gotten slightly better at giving introductions. But I did not learn anything new about ebook readers.

Why do I care? Well. First of all I abhor newspeak. If something is not developing competences, don’t claim that that is what is happening. It might be interesting. It might be beneficial. But if competences are not developed, they are not developed, and please do not claim that they were.

Secondly, it generates the impression that we are spending a lot of resources on developing competences. That is a problem if that impression is false. We will wake up some day, and wonder why we did not learn what was necessary to survive in a changing library landscape. It will be a mystery to us, because we thought we were spending a lot of time developing all sorts of competences. But in reality we did not develop anything.

Euler 100

Project Euler – problem 100

Back to the hopeless examples of probabilities from school.
In a bag there are 15 black balls and six white ones. Project Euler talks about discs, but math teachers have always used balls as examples, and they were always white and black. So I’ll stick with that.

If you draw two balls from the bag, there is a 50/50 chance of drawing 2 black balls:

(15/21)*(14/20)
## [1] 0.5

I’m told that the next set of balls in the bag with that property, is 85 black balls and 35 white ones:

(85/120)*(84/119)
## [1] 0.5

Find the mix of black and white balls, that gives a probability of 50/50 of drawing 2 black balls, given that there should be more than 10¹² = 1000000000000 balls in the rather large bag.

That should be straight-forward.
Let’s call the number of black balls b and the number of white balls w. And let’s define the total number of balls in the bag as n=w+b
The probability of drawing two black balls is:

(b/n)((b-1)/(n-1)) = ½

n = w + b > 10¹²

Two equations with two unknowns.
The probability can be rearranged:

(b/n)((b-1)/(n-1)) = ½ <=>

b(b-1) / (n(n-1)) = ½ <=>

(b² – b) / (n² – n) = ½ <=>

b² – b = ½(n² – n) <=>

2b² – 2b = n² – n <=>

2b² – 2b – n² + n = 0

Hm. Maybe it is not that simple after all. First of all I don’t know if n is 1000000000000 or 1000000000001. That actually makes a pretty big difference:

1000000000001**2 - 1000000000000**2
## [1] 1.999978e+12
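
A side note on that number, as a sketch rather than part of the original solution: algebraically (n+1)² - n² = 2n + 1, so the exact difference is 2·10¹² + 1. The printed value is a little lower, presumably because squares around 10²⁴ are too large for R’s double-precision numbers to hold exactly. Computing 2n + 1 directly avoids the large squares (the variable name exact_diff is just chosen for this illustration):

```r
n <- 10^12
# (n + 1)^2 - n^2 simplifies to 2n + 1, which is small enough to be stored exactly
exact_diff <- 2 * n + 1
format(exact_diff, scientific = FALSE)
```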

Second of all, I need to find integer solutions. An analytical solution might not give integer results. And I can’t have one third of a ball in the bag.

Googling “finding integer solutions to equations” gives, as the first result, a link to the Wikipedia article on “Diophantine equations”.
Which apparently are equations that should have integer solutions.

All right, a couple of the problems I’ve tackled earlier, and quite a lot of Project Euler problems I’ve given up on, appear to be about solving these Diophantine equations.

So. Nice. The last link of the Wikipedia page is to https://www.alpertron.com.ar/QUAD.HTM.
I should probably read up on the methods. But that will have to wait.

The point is, that this Diophantine equation can be solved by:

b~n+1~ = 3b~n~ + 2n~n~ -2

n~n+1~ = 4b~n~ + 3n~n~ -3

The idea is that we have a solution (b~n~, n~n~). And these two equations allow us to calculate the next solution, (b~n+1~, n~n+1~)

Let’s try that. We were given that (15,21) is a solution. The next should be (85,120). Do we get that?

b <- 15
n <- 21
b_n <- 3*b + 2*n -2
n_n <- 4*b + 3*n -3
print(paste(b_n, n_n, sep=","))
## [1] "85,120"

Qapla’, it works. Nice. Now I just need to run through this until n~n+1~ gets above 10¹².

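b <- 15
n <- 21
# Step through the solutions until the total number of balls exceeds 10^12
while(n <= 10**12){
  b_n <- 3*b + 2*n -2
  n_n <- 4*b + 3*n -3
  b <- b_n
  n <- n_n
}
# The number of black balls in the first solution with more than 10^12 balls
answer <- b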
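
A quick sanity check, not part of the original solution: the sketch below steps through the first few solutions and confirms that each one satisfies 2b(b-1) = n(n-1), which is the rearranged probability equation from above. The helper names is_solution and next_solution are made up for this illustration. The check is limited to small solutions, because the products involved in the final answer are too large for R’s ordinary numbers to compare exactly.

```r
# Hypothetical helpers, named here for illustration only
is_solution <- function(b, n) 2 * b * (b - 1) == n * (n - 1)
next_solution <- function(b, n) c(b = 3 * b + 2 * n - 2, n = 4 * b + 3 * n - 3)

s <- c(b = 15, n = 21)      # the known starting solution
checks <- logical(6)
for (i in 1:6) {
  # Test the current solution, then move on to the next one
  checks[i] <- is_solution(s[["b"]], s[["n"]])
  s <- next_solution(s[["b"]], s[["n"]])
}
all(checks)
```

If all(checks) is TRUE, the recurrence has produced a valid solution at every step tried, which makes it easier to trust the loop that runs all the way past 10¹².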

Lessons learned:

  1. Solving Diophantine equations is at the heart of a lot of these problems. I’ve learned a new tool to handle them!
  2. If you want to subscript stuff in RMarkdown, you place a ~ on each side of what you want subscripted.

Other stuff to note: Maybe it is time someone wrote a new solver for Diophantine equations. The one I found is 19 years old. Something to do in Shiny perhaps?

Copy rows to another sheet – based on cell-values

And handling images while you’re at it.

Given: We have some data in a sheet – let’s call it Source. Based on some values in another sheet – let’s call that Condition – we want to copy rows from Source to a third sheet. We’ll call that Target.
To complicate things, we want to copy images as well.

Set some variables to Target, Source and Condition.
Delete the content of the existing target sheet. First all the images, and then the rest. Note that I’m not deleting everything, just from row 6 and down.
Then for each cell (d) in column B of Condition (adjust ranges – here I’m only looking at the rows from 2 to 9), check if the relevant row in Source matches, then copy to Target.

There’s a small detail here, I needed to insert an identifier in Target, defined by a value in Condition. Instead of trying to insert in Column B, I’m just searching and replacing a placeholder – “£$”, a string I was pretty certain would not show up anywhere.


Sub CopyYes()
    Dim c As Range
    Dim d As Range
    Dim j As Integer
    Dim Source As Worksheet
    Dim Target As Worksheet
    Dim Condition As Worksheet
    Dim k As String
    Dim fnd As Variant
    Dim rplc As Variant
    fnd = "£$" 'Placeholder that is replaced with the identifier from Condition

    'Note that the ranges in Source and Condition below should be adjusted. We're not quite there yet.
    Set Source = ActiveWorkbook.Worksheets("Ark4")
    Set Target = ActiveWorkbook.Worksheets("Ark3")
    Set Condition = ActiveWorkbook.Worksheets("Ark1")

    ' Start by clearing target sheet
    ' begin with images
    Target.Pictures.Delete
    ' Then we'll delete the rest, from row 6 and down
    With Target
        .Rows(6 & ":" & .Rows.Count).Delete
    End With

    j = 7 'This will start copying data to Target sheet at row 7
    For Each d In Condition.Range("B2:B9") 'Ark1
        k = d.Offset(0, -1) 'The identifier in the column to the left of d
        rplc = k
        For Each c In Source.Range("B2:B52") 'Ark4
            If d = c Then
                Source.Rows(c.Row).Copy Target.Rows(j)
                j = j + 1
            End If
        Next c
        'Replace the placeholder in the rows just copied with the identifier
        Target.Cells.Replace what:=fnd, Replacement:=rplc
    Next d
    'we'll end by hiding some columns
    Target.Columns("A:E").Hidden = True
End Sub

Replacing values in a dataframe – to what a previous value was

Given a set of data, where some values indicate that they are the same as a previous value, how do we replace them with the correct value?

E.g., this dataframe:

(m <- data.frame(i=c(1:10,NA), t=c("lorem", "do", "do", "Do", "ipsum", "do", "Do", "(do)", "dolor", NA, "test"), stringsAsFactors=F))
##     i     t
## 1   1 lorem
## 2   2    do
## 3   3    do
## 4   4    Do
## 5   5 ipsum
## 6   6    do
## 7   7    Do
## 8   8  (do)
## 9   9 dolor
## 10 10  <NA>
## 11 NA  test

How do we replace the first three “do”s with “lorem” and the next set of “do”s with “ipsum”?

Using fill() from the tidyr package is straightforward. It takes a data frame and a column, locates all NAs in that column, and replaces them with the last non-NA value.
Simple enough: change all the variations of “do” to NA, run fill(). Done.
One problem: there might be NAs in the dataset that we do not want to affect.
Solution – there might be a more elegant one, but this works:

  1. Change the NAs to something that does not occur in the data
  2. Change the variations of “do” to NA
  3. Use the fill()-function
  4. Change the NAs from step 1 back to NA

library(tidyr)
rpl <- "replacement"
m[is.na(m$t),]$t <- rpl
doset <- c("do", "Do", "(do)")
m[(m$t %in% doset),]$t <- NA

m <- m %>% fill(t)
m[(m$t == rpl),]$t <- NA
m
##     i     t
## 1   1 lorem
## 2   2 lorem
## 3   3 lorem
## 4   4 lorem
## 5   5 ipsum
## 6   6 ipsum
## 7   7 ipsum
## 8   8 ipsum
## 9   9 dolor
## 10 10  <NA>
## 11 NA  test

Done!
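
For comparison, here is a sketch of the same idea in base R, without tidyr and without the temporary replacement string. It is an alternative, not the method used above. The function name fill_ditto and its ditto argument are made up for this example, and it assumes the values live in a plain character vector.

```r
# Hypothetical helper: replace "ditto" marks with the most recent real value
fill_ditto <- function(x, ditto = c("do", "Do", "(do)")) {
  is_ditto <- x %in% ditto   # NA %in% ditto is FALSE, so genuine NAs are left alone
  # For every position, the index of the latest element that is not a ditto mark
  last_real <- cummax(ifelse(is_ditto, 0L, seq_along(x)))
  last_real[last_real == 0L] <- NA   # a ditto mark with nothing before it becomes NA
  x[last_real]
}

values <- c("lorem", "do", "do", "Do", "ipsum", "do", "Do", "(do)", "dolor", NA, "test")
filled <- fill_ditto(values)
filled
```

Because the ditto marks are tracked with a logical vector instead of being turned into NA, the existing NAs never need to be hidden and restored.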

Oh, and by the way, this is my first post generated directly from RStudio!

Euler problem 41

Project Euler is a pretty good way to exercise your programming muscles. I tend to think that the hardest part is usually the math.

So when I figure one of them out in minutes, I’m pretty happy about it.

library(gtools)
library(numbers)
pandig <- function(n){
x <- permutations(n,n)
x <- apply(x,1, function(x) paste(x, collapse=""))
return(x)
}
y <- as.numeric(pandig(7))
max(y[isPrime(y)])

The surprising thing is that the largest pandigital prime does not begin with either 8 or 9.
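
A likely explanation, sketched here rather than taken from the original reasoning: a number is divisible by 3 when its digit sum is, and every n-digit pandigital number has the same digit sum, 1 + 2 + … + n. The snippet below lists the lengths for which that sum is not divisible by 3, which are the only lengths that can contain a prime at all. The variable names are chosen for this illustration.

```r
# Digit sum shared by all n-digit pandigital numbers, for n = 1..9
digit_sums <- cumsum(1:9)
# Lengths where the digit sum is not a multiple of 3
candidate_lengths <- which(digit_sums %% 3 != 0)
candidate_lengths
```

Only the lengths 1, 4 and 7 remain, so every 8- and 9-digit pandigital number is divisible by 3, and starting the search at 7 digits is enough.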

Bitcoins and crypto currency

These days at least one cryptocurrency has tanked completely, Bitcoin is coming under increasing pressure from authorities, and there are revelations of organized manipulation of the trading, etc. etc.

I am reminded of something that happened in the Netherlands in 1637.

Read about it here. I am continually baffled by our inability to learn from history.

Practical project management

Project management is not easy. A handful of practical hints from ~20 years of experience:

  • Don’t ask for more resources. Ask for the resources you were promised, but never got.
  • Don’t ask for an extension on deadlines. Ask that the ones you got in the first place are not changed.
  • Never expect anyone to have heard of the iron triangle of project management.
    Even if they actually have heard of it before.
  • Never expect anyone to have any understanding of what the project is about.
    Even if they are in charge of managing it.
  • When, at the beginning of a project, estimations of the necessary resources are deemed irrelevant by the steering committee, buy antacids. You will get an ulcer.
  • When someone is given a budget – never expect them to stay within it.