2023/2024 Roadmap

As the board, we talked to several members of the community about projects, efforts, and plans they have for Opencast in the near future, so that we could integrate them into a roadmap we would like to share with you.

Most of these projects are expected to be finished in early 2024 and will likely make it into Opencast 16. However, maybe one or two will make the Opencast 15 cut?

Auto-Update Metadata

We all know the situation: Someone changed a video title in Opencast and then complains that the player still shows the original title. The answer: Did you run the workflow to republish metadata? Users do not understand that saving a new title is not enough.

That said, is it even reasonable to expect users to understand the additional step required? Looking at how Opencast works and is being used, probably not. That is why we would like to make it possible for Opencast to automatically update the publication when you update metadata or access rights.

This should help users, make editing faster, and lead to fewer mistakes when metadata is updated manually. It should also make integrations simpler, since you no longer need to make sure to run additional steps and updates are no longer blocked by a workflow.

This project is driven by Osnabrück University. Our estimated timeline for this is to have it done by the end of the year.

Update To The Editor

Updating the editor is part of the ongoing crowdfunding campaign. It will fulfil some long-standing needs: Finally, users will be able to upload and download subtitles, so they can use them outside Opencast or edit them manually and re-upload them. Also, the “remove all segments” feature and the timeline zoom will become available.

These new features will also be accompanied by a number of fixes to bugs and other issues we have collected lately.

Lifecycle Management

This feature is driven by two requirements: One is to help lecturers use recordings more flexibly: Don’t publish them right away, but at the end of the semester. Or at a given time. Or with a delay to the recording date. In order to serve these requests efficiently and reliably, Opencast requires additional metadata and/or workflows. The second requirement comes from the other end of the publication: Please unpublish (or delete) these recordings at the end of the exam season. Or, from an institutional perspective: Unpublish all lecture recordings two years after they were produced. But inform lecturers in advance. And delete them from the archive after five years. Again, metadata and workflows will help you meet these needs.
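The institutional rules above can be sketched as a small age-based policy check. This is only an illustration, not Opencast code: the function name, the action names, and the 30-day notification window are assumptions (the text only says lecturers should be informed "in advance"), and a real implementation would rely on Opencast metadata and workflows.

```python
from datetime import date, timedelta

# Policy values from the example in the text; NOTIFY_BEFORE is an assumption.
UNPUBLISH_AFTER = timedelta(days=2 * 365)
DELETE_AFTER = timedelta(days=5 * 365)
NOTIFY_BEFORE = timedelta(days=30)


def lifecycle_action(recorded_on: date, today: date) -> str:
    """Return the (hypothetical) action due for a recording of a given age."""
    age = today - recorded_on
    if age >= DELETE_AFTER:
        return "delete"
    if age >= UNPUBLISH_AFTER:
        return "unpublish"
    if age >= UNPUBLISH_AFTER - NOTIFY_BEFORE:
        return "notify"
    return "keep"


recorded = date(2020, 1, 1)
for day in (date(2021, 1, 1), date(2021, 12, 15), date(2022, 6, 1), date(2025, 6, 1)):
    print(day, lifecycle_action(recorded, day))
```

Running such a check periodically and starting the matching workflow for each result would be one way to serve these requests without manual intervention.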

Playlists

In Opencast, we already have series to group a bunch of videos and put them (for example) into a course. But this concept is fairly restrictive. What if I have an introduction video I want to put into several series? The answer right now: Upload it to Opencast again or clone it. That’s not a great solution.

This and more has led us to believe that it is time to introduce the new concept of playlists to Opencast. A playlist is a list of videos, similar to the same concept on YouTube. This provides an n:m mapping, meaning that every playlist can contain multiple videos, and each video can be included in multiple playlists.
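To make the n:m mapping concrete, here is a minimal in-memory sketch. The class and identifiers are hypothetical and say nothing about how Opencast will actually store playlists. It only shows that one video can be listed in several playlists without being uploaded or cloned again.

```python
from collections import defaultdict


class PlaylistIndex:
    """Hypothetical in-memory n:m mapping between playlists and videos."""

    def __init__(self):
        # A playlist is an ordered list of video ids.
        self._videos_by_playlist = defaultdict(list)
        # Reverse lookup: which playlists contain a given video.
        self._playlists_by_video = defaultdict(set)

    def add(self, playlist_id, video_id):
        """Append a video to a playlist unless it is already in it."""
        if video_id not in self._videos_by_playlist[playlist_id]:
            self._videos_by_playlist[playlist_id].append(video_id)
            self._playlists_by_video[video_id].add(playlist_id)

    def videos(self, playlist_id):
        return list(self._videos_by_playlist.get(playlist_id, []))

    def playlists(self, video_id):
        return sorted(self._playlists_by_video.get(video_id, set()))


index = PlaylistIndex()
index.add("course-a", "intro")
index.add("course-a", "lecture-1")
index.add("course-b", "intro")  # same video, second playlist, no re-upload

print(index.videos("course-a"))
print(index.playlists("intro"))
```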

Part of the project is that it should be easy for integrations like LMS plugins (Moodle, Stud.IP, ILIAS) or video portals (Tobira, WordPress) to take these lists and render them.

The project is driven by TU Wien and Osnabrück University. We hope to have this finished in time for Opencast 16 in early 2024.

Integration of Whisper

The idea is to have an on-premise, open source transcription service in Opencast. This is an alternative to the existing integrations with AmberScript, Microsoft and others. As an on-premise solution, this is also more data protection compliant (keyword GDPR).

One part of the project is to have a Whisper integration for both GPU and CPU. This allows institutions – like us at the moment – without GPU server infrastructure to still easily use this feature.

Another aspect of this project is that lecturers should be able to initiate subtitling in different languages via the LMS (we use Moodle), e.g. by starting a corresponding workflow from the block plugin, uploading or downloading subtitles, deleting them again if necessary, or being able to post-process them via the Opencast Editor.

The basis should of course be that the subtitles are saved as tracks. That way, when lecturers cut a video, the subtitles are cut along with it and always remain in sync.
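The following sketch illustrates what cutting subtitles along with the video means. It is not how the Opencast Editor implements it. It assumes cues are given as (start, end, text) tuples in seconds and that the cut is described by a sorted, non-overlapping list of segments to keep. Both the function name and the data layout are made up for this example.

```python
def cut_cues(cues, kept_segments):
    """Map subtitle cues onto the timeline that remains after cutting.

    cues: list of (start, end, text) in seconds on the original timeline.
    kept_segments: sorted, non-overlapping (start, end) ranges that are kept.
    """
    result = []
    offset = 0.0  # total length of the kept segments processed so far
    for seg_start, seg_end in kept_segments:
        for start, end, text in cues:
            # Clip the cue to the part that lies inside this segment.
            clipped_start = max(start, seg_start)
            clipped_end = min(end, seg_end)
            if clipped_start < clipped_end:
                result.append((clipped_start - seg_start + offset,
                               clipped_end - seg_start + offset,
                               text))
        offset += seg_end - seg_start
    return result


cues = [(0.0, 2.0, "Hello"), (5.0, 7.0, "cut"), (10.0, 12.0, "World")]
kept = [(0.0, 4.0), (8.0, 14.0)]  # seconds 4 to 8 are removed
print(cut_cues(cues, kept))
```

In this sketch, a cue that lies entirely inside a removed range disappears, and a cue that spans a cut is split into one cue per kept segment.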

Right now, we are testing openai-whisper and whisper-ctranslate2. The latter is optimized for CPU usage. There is already a pull request from Martin (ELAN e.V.) for improving the speech to text engine.

If there are other approaches, like the WhisperC++ pull request, that might be easier to integrate or where the creation of subtitles is faster, we would be happy to hear from you.

The project is driven by HU Berlin and is supported and developed by ELAN e.V. We hope to have this finished in time for Opencast 16 in early 2024.

Remove Solr and New Search

The Solr index is what powers your users’ ability to find videos. Our integration works, but has not been maintained in years, and it is starting to show. With the last round of crowdfunding, we decided to replace the existing Solr implementation with one based on Elasticsearch/OpenSearch – the same technology powering the rest of the indexes in Opencast.

The basic skeleton of the required changes is already complete. However, there is still work to be done, and extensive testing with real-world data is definitely something we want to do prior to pushing this out to adopters! Because of this, we are pushing the target release from Opencast 15 to Opencast 16.

Thanks to funding from the University of Stuttgart, we expect this work to be complete sometime in December for inclusion early in the Opencast 16 release cycle.

Video Portal

The active development of Tobira is currently mostly about many incremental improvements, small new features, and bug fixes. A few notable plans include: the ability to set and modify access permissions (ACLs), to delete videos, and modify metadata of videos, and also to set access rights on pages.

Opencast 2023 Roadmap

As the board, we talked to several members of the community about projects, efforts, and plans they have for Opencast in the near future, so that we could integrate them into a roadmap we would like to share with you.

Hello New Admin Interface

While we have been talking about it for a while, early 2023 should finally reveal the new admin interface.

But hold on… doesn’t it look kind of… identical? Yes, it does.

The main goal of this effort spearheaded by the University of Stuttgart is to replace the obsolete foundation we built on. Web technology has evolved over the years and what was state-of-the-art in 2014 when the current admin interface was conceived is remembered by few gray-bearded old developers today.

The new implementation allows us to fix some problems and makes it easier to build on the current foundation. This should hold true, especially for new developers.

Opencast Studio: Bug-bash

Opencast Studio was released in early 2020 and immediately saw very heavy usage during the pandemic.

But over the last couple of years, a number of minor issues came up and have been recorded in the issue tracker. These issues were not important enough for someone to immediately fix them, but they still contain lots of good ideas.

That is why we will review them and try to fix as many as possible in one big update.

Opencast Studio: Desktop Audio

Opencast Studio allows you to easily record your camera, desktop, and microphone. Unfortunately, desktop audio is missing. You want to play a short video and include that? No audio. You want to show someone how screen readers work? No audio.

In 2023, we would like to change that and again tackle the issue of recording desktop audio. This was hard with browsers in 2020, but technology has improved since then, and we hope that we can now make this happen.

Opencast Studio: Camera Blur

A lot of video conferencing tools allow users to automatically blur the background to hide their personal spaces.

We hope to transfer this technology to Opencast Studio in the first half of 2023 and include this in a later Opencast 13 or 14 release.

Tobira in Action

Due to the growing number of adoptions, early 2023 will be a lot about “refinement” for Tobira.

Specifically, there will be a focus on improving the design and accessibility across the entire application. But also more concrete features like the uploader offering more control over the content, or the video page giving more details and sharing options to the user.

Another area of focus is the search, which will incorporate more metadata like subtitles, for example. We are also thinking about how we can meaningfully integrate statistics for both producers and consumers, and user-generated content.

Documentation Overhaul

One of the best things about Opencast 2 was the new documentation. No more reading through several wikis in the hope that somewhere someone wrote down something about what you want to know. Everything started to be in one place, you could easily switch between different versions, and there was no longer an excuse for developers to not provide documentation alongside their patches. Not that an Opencast developer would ever not write nice documentation, of course…

Still, our documentation is getting old. Priorities have shifted. Functionality that has been purely optional in the past has become an integral part of Opencast. Some deployment options are mentioned but not properly explained. Some functions are no longer supported. You have all seen some of these problems pop up here and there.

That is why we set out to focus on a big overhaul of the Opencast documentation for Opencast 14. Having better documentation should help us all.

Goodbye and good riddance Solr

Solr is one of the search indexes used by Opencast. It is very old and has already been replaced in most places as part of last year’s crowdfunding. As part of that, we also built a prototype of a Solr-free search service, which is the last part using Solr.

While the prototype works, we have to be careful when replacing this part in Opencast since the publications, players, and many integrations depend on this. That is why we turned this into a separate project for early 2023.

The advantage of getting rid of this final Solr index is that we can easily make both the admin and the presentation nodes redundant. That is great in terms of high availability. Additionally, we will also fix a few minor security issues along the way.

Advanced Storage Backend for Archive

The asset manager is the central storage for Opencast recordings and can contain both source and processed video material as well as additional files like metadata catalogs, subtitles or preview images.

With the growth of Opencast instances over the last couple of years, the need for asset manager storage has also increased significantly for many adopters. This led to problems due to storage system limitations.

To address this issue, the asset manager will get an updated storage backend that allows adopters to split the storage between different backends. This allows for far more flexibility and a seamless extension of existing storage.

The target release for this new feature is Opencast 15.

Subtitles as First-Class-Citizens

Subtitles have been a topic in Opencast for the last couple of years. New integrations with cloud transcription services have been added, free open source tools have been integrated and the new editor now has a subtitle editor for users to improve subtitles themselves.

While Opencast now has many tools, what is still missing is a coherent integration and a workflow easy to use for all users. The community has expressed sufficient interest in subtitles that the default workflow will automatically include support for subtitles (uploads, editing, …), so you no longer have to configure this yourself.

Doing this is part of an ongoing effort for Opencast 14.

Opencast 13 release branch cut

Hi everyone, the Opencast release branch (r/13.x) has been cut. Please check if pull requests point to the correct branch.

For a guide on what you can still add to release branches, please refer to the acceptance criteria for patches in different versions.

Remember the release schedule for this release:

  • Cutting the release branch: November 16, 2022
  • Translation Week: November 21, 2022
  • Public QA phase: November 28, 2022
  • Release date: December 14, 2022

As always, we hope to have a lot of people testing this version, especially during the public QA phase. Please report any bugs or issues you encounter.

For testing, you may use stable.opencast.org if you do not want to set up a test server yourself. The server is reset on a daily basis and will follow the new release branch with its next rebuild.

Additionally, look out for announcements regarding container and package builds for testing on the list if you want to run your own system but do not want to build Opencast from source.

Crowdfunding: An Update

What’s the current state of the crowdfunding? Where are we, when it comes to our goals? What work has been done already? These are the questions this article tries to answer.

TL;DR

  • We did a lot of security updates
  • Work on removing ActiveMQ is done
  • Work on two of the three Solr indexes is done
  • We have problems with the Spring update

ActiveMQ

github.com/opencast/opencast/pull/3100

The goal: Installing Opencast may be somewhat confusing to new users, partly because there are lots of different additional services to run. For a long time, one of them has been ActiveMQ, which is a message broker used for inter-service communication in Opencast. Used… well… barely used, actually. With recent versions, we needed ActiveMQ on a single server only. Since ActiveMQ is meant to distribute information across multiple servers, this meant we could also communicate with these services directly. In short, less overhead and fewer additional services to run for adopters. That is why our goal was to entirely remove Opencast’s dependency on ActiveMQ.

Current state: Work on this task has been mostly finished. A pull request removing ActiveMQ has been filed and reviewed. All that is left is a bit of cleanup work before it can be merged. This means that this is almost guaranteed to make it into the next major Opencast release.

Security Issues

github.com/opencast/opencast/security/advisories/GHSA-hcxx-mp6g-6gr9
github.com/opencast/opencast/security/advisories/GHSA-j4mm-7pj3-jf7v
github.com/opencast/opencast/security/advisories/GHSA-59g4-hpg3-3gcp
github.com/opencast/opencast/security/advisories/GHSA-mf4f-j588-5xm8

The goal: Opencast has a good track record of identifying and fixing security issues, and we had identified a few known or potential security issues we wanted to evaluate and fix, if they turned out to be problematic. That way we can keep our servers safe and avoid any spectacular data breaches.

Current state: There have been a number of security fixes for Opencast 9, 10 and 11. The issues we addressed range from limited data extraction and privilege escalation to potential remote code execution attacks. Fixes for these have been included in the last couple of releases. We have also been able to dismiss a few reports of code we suspected to be problematic which turned out not to be a problem after all. Still, we have not yet processed the whole list of suspects. We will inform you, as usual, if we release another security patch and will keep trying to make these releases as responsible and painless as possible for adopters.

Log4j: We cannot talk about security fixes without pointing out one particular problem we faced as part of the crowdfunding. The Log4Shell remote code execution vulnerability and several additional vulnerabilities found in this library after the world’s security researchers all turned their attention towards Log4j have affected Opencast as well. We released several versions of Opencast in December to address these issues as fast as we could, since we knew that these vulnerabilities were actively exploited. To help adopters, we even decided to release new versions of Opencast 9 since it only just reached its end of life, and we knew about many adopters not having updated yet.

Solr

github.com/opencast/opencast/pull/3204
github.com/opencast/opencast/pull/3376
github.com/opencast/opencast/pull/3377

The goal: Opencast uses both Solr and Elasticsearch for full text search and caching. Both services serve an almost identical purpose. However, one of them is in desperate need of attention: Solr. We built an integration with Solr using an older version, which is both too old to easily deploy in a cluster, and not easy to update. In short, things have to change. But instead of updating Solr and still ending up with two different services doing the same thing, we chose to consolidate on Elasticsearch [1].

Current state: Opencast uses Solr for three services: The series service, the workflow service and the search service. All of these services were user-facing in Opencast (Matterhorn) 1.x, which is why full text search and caching were important. The same is no longer true today, and thus the need for some of these indexes no longer exists.

We were able to completely remove two of the three Solr indexes, sparing adopters from re-indexing these ever again. The services this was done for are the workflow service and the series service. In the future, data will be requested from the database directly. The patches for these are currently being reviewed. We hope to get these merged soon to have them included in the next major new version of Opencast.

Work on the final service, the search service, is more complex and not yet done. We cannot remove Solr in the same manner, since full text search capabilities are actually used here and the service is still user-facing, being the back-end for the players among other things. We hope to still be able to make the shift to Elasticsearch for Opencast 12, but this is more challenging, and we will act with caution since it’s a central piece of Opencast infrastructure being used by all adopters.

[1] We may actually use OpenSearch instead of Elasticsearch, but that should be a drop-in replacement. We will report if it actually is. But for the sake of simplicity, we stick to Elasticsearch for reporting.

Spring

The goal: Opencast uses Spring Security for handling logins and access control. We have fallen behind on updating the library and are now using a version which is no longer supported. While this does still work just fine, unfortunately, like we have seen with Log4j, this bears the risk of suddenly blowing up. Thus we would like to update.

Current state: Our plan was to separate the different login mechanisms Opencast supports which are all woven into Spring Security, then start updating the core and basic login mechanisms first. At that point, we wanted to discuss further actions with the community.

Sadly, it turned out that this plan is not as easy as we hoped. Newer versions of Spring Security do not work well with our OSGi stack, and even updating just the core is not possible. We are now evaluating two options: the versions picked up by the Eclipse Gemini and Apache ServiceMix projects, which are still supported but not the latest, and possible support within Karaf itself, which has been hinted at for the next major version but has not yet been confirmed.

Due to the not yet finalized statements about the Karaf roadmap, we decided to focus on the other tasks first, leaving this as the last potential task to tackle. The exact form of how we can/will tackle this problem and if we can completely fix this in this crowdfunding is still to be determined. We will make sure to start an open discussion about this once we have collected all information.

Questions; Next Steps

If you have any questions or want to discuss any of these tasks, don’t hesitate to bring this to the development mailing list, the Matrix chat or bring it up in the weekly technical meetings. Furthermore, if you want to help, consider reviewing any of the open pull requests linked above.

We will post again, once we have reached a new major milestone. Additionally, we will submit a session about the state of the crowdfunding at the upcoming conference. Join us there for a discussion, if you are interested.

Opencast Crowdfunding 2021

As part of this year’s crowdfunding, the Opencast board in cooperation with the group of Opencast committers and some commercial partners would like to raise money to address some underlying infrastructure, security and performance problems.

These funds will help us to future-proof Opencast, make it easier to operate and deploy, and ensure the security standard we all expect.

We are actively seeking a discussion about the technical requirements related to these changes, in particular with the group of committers participating in the technical meetings. This is something we would like to continue while working on the issues.

These proposed changes affect many core Opencast components, and as such all time estimates are subject to change. We have tried to be as accurate as possible and will strive to be very transparent about what we spend time on.

Spring update

Spring Security is a powerful and highly customizable authentication and access-control framework used by Opencast to handle logins and to decide if and how users may access different parts of the application.

The version of Spring we are currently using is out of date, and in need of an update. This will ensure Opencast’s core security layer is up-to-date, as well as resolving a few outstanding bugs.

Unfortunately, Spring closely ties into several authentication methods for Opencast (LDAP, CAS, Shibboleth, …) causing a ripple-effect of components which need to be updated and/or replaced.

Addressing the problem

To go forward, we propose to update Spring, and deactivate all external authentication methods. This first step allows us to ensure that the Opencast core authentication method works fine and does not cause problems or side effects in its shiny, new, updated form.

Once that is done, we would like to evaluate how best to continue with the external authentication components. Based on the effort we estimate for updating the modules, we see two ways of going forward and would like to decide on one of them in an open discussion:

  1. Continue as before, updating all the necessary authentication modules. This, of course, is the way we prefer if it turns out that updates seem to be easy. This will require extensive community involvement to ensure the various integrations continue to work.
  2. Only keep some basic internal authentication mechanisms and use an authentication proxy or an external plugin to provide different authentication options. This is something often used in modern applications (e.g. see Grafana) and also something which e.g. the new video portal will be using. Of course, we would provide examples for the most common authentication methods like LDAP.

We invite everyone to participate in this discussion once it is time for the decision to be made. Watch the users/dev list for that discussion.

Whatever we do, this is certainly a larger change to the internal infrastructure of Opencast and as such, we would like to not rush the change. This means we would target Opencast 12, instead of the upcoming release of Opencast 11. This allows us to detect potential issues early and also allows adopters to test and adapt their integrations if required.

We estimate approximately one to two weeks to update the core and investigate possible further steps to fuel a discussion, with five to six weeks of work time for this task if we need to update all the integrations.

Search index update

At some point as an Opencast administrator, you are almost guaranteed to have to touch one or the other of Opencast’s search indexes and so may have asked yourself, what are these indexes used for and why are there so many?

The answer is complicated and boils down to history. Some aren’t actually necessary any longer, having been created for front-ends which have long since been removed. Others are still active. As a committer group we have long been aware of the state of the indexes, especially the different types (Solr vs Elasticsearch), but “it still works”, and thus it is hard to justify spending time on removing them.

Still, people have often faced problems involving indexes, especially over the last two years when systems grew bigger and bigger. This means that saying “index update” can literally mean a bad day, or week, for someone.

This is what we would like to address with three main goals in mind:

  1. Remove the unnecessary indexes
  2. Unify the ecosystem, consolidating all remaining indexes to use one system (likely OpenSearch)
  3. Fix major performance pain points in the infrastructure

As with the previous project, we would like to be transparent and keep the community informed about what we do, while we do it. Fortunately, for this, the tasks are more obvious. Here is a list of Opencast components and what there is to do:

  • The Search Service is one component still using the old Solr infrastructure. This index, in particular, is heavily used and heavily depends on full-text search. This means we need to update this to the new search index infrastructure while keeping backwards compatibility as good as we can.
  • The Workflow Service still contains an old Solr index which can likely be removed. No internal components require full-text search any longer. One point open for discussion is, however, the current integration in the API which might be affected in some very specific sub-queries.
  • The Series Service is also using an old Solr index. It is not used internally and can probably be removed. For external applications, the same information is available via the external API and its search index.
  • The Index Service is the heart of the Elasticsearch based services. Its problem is that instead of the well-supported high-level API, it still includes and modifies Elasticsearch internally. We should move to the new API there. This also allows us to address some known performance problems, potentially reducing the system load caused by the admin interface and/or the External API. The Index Service is used by the Admin Interface API and the External API, but we believe that only an update of the central component is necessary.

To fix these issues, we estimate a workload of about 7 weeks.

Replace/Remove ActiveMQ

Apache ActiveMQ is used to send messages between components in Opencast. Its use has been in decline over time and is now at a point where we would like to remove it entirely. This would lead to a smaller system footprint, less complexity, and a system that is easier to set up and maintain.

The task is relatively straightforward since messages are sent only within the admin node and not to external servers or services. This means what we need to do is to find these communication channels and decide how to wire them together instead. OSGi should help with this.

We estimate that this work will take about two weeks of work time to finish.

Security Fixes

These should need no further explanation, but if you still need one: Opencast is a public, HTTP based service with a large attack surface. Attackers come up with more and more ways of getting into systems and it is our duty to keep them out. This means we need to keep our defenses up and fix even potential problems before it is too late.

Opencast has a good track record of identifying and fixing potential problems and we would like to continue this by investigating some potential problems we have identified and, of course, fixing them if they turn out to be a problem.

We estimate about two weeks of work time to get this done.

Roadmap

We would like to start mid-October. In particular, we would like to get the first part of the Spring update done as fast as possible, since we need a community discussion before we start with the second half.

We would like to start with:

  • Spring Security core update
  • Remove indexes we do not need to replace (series, workflows)
  • Remove ActiveMQ
  • Security updates

Who is doing the work

We have talked to the commercial partners in the Opencast space and have identified two organizations willing to cooperate on this:

  • Loganite Inc
  • ELAN e.V.

Want to help?

If you want to participate, please contact the head of the Opencast Board, Olaf Schulte. He will discuss the best way for going forward. Contributions can be made either as contracts directly with the participating companies, or via the Apereo Foundation. Talk to Olaf if you have additional requirements or constraints.

Due to the amount of work, the execution and contribution of the code will start in 2021 but continue and be finished in 2022. Opencast 12 is our focus for the bulk of the work.