The Impact of Artificial Intelligence on the Creativity of Videos

This study explored the impact of Artificial Intelligence (AI) on the evaluation of creative elements in artistic videos. The aim was to verify to what extent the use of an AI algorithm (Style Transfer) contributes to changes in the perceived creativity of the videos. Creativity was evaluated through six quantitative items (Likert-type scale) and one qualitative question (a description of the creativity expressed in the video in two words or expressions). Six videos were shown to both a control group (N = 49) and an experimental group (N = 52) to determine possible differences in creativity assessment criteria. Both groups contained experts (Experimental, N = 27; Control, N = 25) and non-experts (Experimental, N = 25; Control, N = 24). The first round consisted of six videos that were the same for both the experimental and control conditions (used to check for bias). No significant differences were found. In a second round, six videos were shown with AI transformation (experimental condition) and without that transformation (control condition). Results showed that perceived creativity increased in the experimental condition in two cases and decreased in one case. In most evaluations no differences were observed. Qualitative evaluations reinforce the absence of a general pattern of improvements in AI transformations. Altogether, the results emphasize the importance of human mediation in the application of AI in creative production: a hybrid approach, or rather, Hybrid Intelligence.


INTRODUCTION
1.1 Background
Art, and its underlying creative process, has always existed in dynamic interaction with its environment and culture. As was the case with the emergence of photography, it is believed that machine intelligence, also known as Artificial Intelligence (AI), will play a huge role in future artistic breakthroughs, deeply affecting the way art is perceived and created [2]. Arcas [2] argues that, as occurred with previous technologies, some artists will embrace the medium while others will reject it.
9:2 Ana Daniela Peres Rebelo et al.
We already see a similar mechanism for art production in other technologies. A photographer works together with the machine in the creation of artworks and, as such, works not only with his own biology but also with a technological creation that consists of a machine with software. In that sense, the photographer is a hybrid artist working with both his own biology and human-engineered technologies [2]. Arcas [2] explains that while technologies like the camera attempt to mimic the human eye, AI software aims to do something distinct: to mimic the human brain.
Multiple questions can guide future research on this subject. We highlight the following as the most relevant for the present research. Firstly, it is important to study to what extent AI can be an enabler that helps the artist create art [40]. Secondly, it is important to study the differences between artistic outputs made by human artists and those made by AI algorithms [42]. Thirdly, it is important to deepen the understanding of the changes in the role of the artist and in the creative output that come from introducing AI in artistic creation [2].
In this paper, we explore the changes in the perceived creativity of the produced artifacts through the use of the AI Style Transfer algorithm. This AI system works by analyzing an image of a given style, separating the style from the content, and later applying it to other imagery. It captures different characteristics of the input image and combines them to create a diverse collection of differently styled images [30].

Goals and Hypothesis
In the context of the cultural shift caused by the changes in the artistic paradigm driven by the use of the Artificial Intelligence systems presented above, the goal of this research is to analyze the changes in the perceived creativity of videos produced by the application of an AI algorithm (Style Transfer) in comparison with the same videos without the AI transformation. To achieve this, the following objectives were defined: (1) to verify whether the use of an Artificial Intelligence algorithm ensures a more creative artistic outcome in videos as perceived by individuals; (2) to identify the creative elements that the use of an Artificial Intelligence algorithm produces in an artistic product, the videos; and (3) to describe the perception of the spectators regarding the pieces created with and without the use of an Artificial Intelligence algorithm. Based on these goals, the following general hypothesis was formulated for this study: the videos transformed through the use of an Artificial Intelligence algorithm (Style Transfer) are perceived as more creative in general than the ones without transformation.

THEORETICAL FRAMEWORK
2.1 Defining and Conceptualizing Artificial Intelligence
In the context of this study, it is crucial to highlight the definition of Artificial Intelligence and the different branches of this area in order to further understand the functioning of the Style Transfer algorithm.
According to Wang [53], intelligence consists of a movement towards the acquisition of competence and knowledge. It also manifests itself through the competence of problem-solving. This faculty, which was once believed to be limited to the human experience of reality, can now be present in AI systems. The author highlights that the creation of robots and other intelligent systems seems to reinforce the possibility of the presence of intelligence in artificial entities [53]. Another important aspect to consider for this study is the definition of the creative product. Amabile [1] characterizes this term and relates it to a definition of creativity. The author states that a product can be considered creative when those familiar with the field in which it was created independently recognize it as such. In the present research, the characteristics of creativity presented by Avdeeff [5] and Amabile [1] will be analyzed in the context of the output videos transformed through the use of Style Transfer.
Another aspect that is highly relevant to the present research is the concept of computational creativity, a field that studies the development of software that generates creative work. Creative ability was previously believed to be a trait unique to humans. Over the past few years, the question has been raised of whether an algorithm can independently create artwork that would be indiscernible from work made by humans. This raises the question of what the implications would be, for instance, whether or not the software and algorithms can be seen as creative [44]. Recent developments in Artificial Intelligence systems challenge the notion of humanity holding exclusivity in art creation.
Boden [20] defines three types of creativity which are important when exploring creative AI systems. Firstly, there is combinational creativity, in which familiar ideas are combined to generate new ideas. As stated by Frigotto and Riccaboni [29], the combination of different elements is crucial for creativity and an essential part of the creative process. This connects with the definition of creativity presented by Henriksen and Mishra [31], who define the concept as the connection and fusion of preexistent ideas for the generation of new ones. Secondly, there is exploratory creativity, in which structured conceptual spaces (which the author defines as thought styles) are explored for the creation of new ideas. Thirdly, there is transformational creativity, in which the dimension of the conceptual space is transformed, originating new structures. Based on the three creativity types, Basalla and Schneider [18] present deep learning as an important instrument for generating creative AI, drawing a parallel between how humans and machines create. The authors present Style Transfer networks as useful tools for combinational creativity [30].
Some authors, such as Ornes [44], stress the idea that these systems are useful merely as tools. Others, such as Park [46], see the possibility of viewing AI software as creative agents as something that can soon become a reality. Finally, others, like Audry and Ippolito [4], state that whether AI can be truly creative or not is not the right question and that we should instead ask what changes this brings to artistic production. Regardless of the answer, the possibility of measuring creativity seems to be relevant and can contribute to a better understanding of the relationship between creative art and AI systems.
Since this study aims to understand whether the use of an AI algorithm contributes to changes in the perceived creativity of the videos, it is important to understand how those changes can be measured. In the next few paragraphs, we briefly present the different categories of tests that measure creativity and specify the type of test that was used for this research.
Chakrabarti and Sarkar [21] analyzed a series of 74 creativity tests and divided them into five different categories, according to the different approaches. The first category is "Ability-based tests". This refers to tests that seek to measure the strength of the subject's creative capabilities. The second category, "Character-based tests", works through the identification of characteristics of an individual's character that are believed to be fundamental aspects that contribute to the creative ability. Thirdly, there's "Determination of past creative activities". These tests aim to measure creativity through the number of creative tasks performed in the past [21].
The fourth category presented by the authors is "Outcome-based tests", which work through the analysis of creative outcomes. These evaluate the creative product instead of focusing on the individual. Some of the tests in this group include the Creative Product Semantic Scale (CPSS), the Consensual Assessment Technique (CAT), and the Student Product Assessment Form [21]. Since this article focuses on the product, "Outcome-based tests" were the type of creativity test employed.
Finally, the last group of tests classified refers to "Environment suitability tests". In this type it is considered that a specific environment is a factor that contributes to the development of creative talent and creativity enhancement. As such, these tests seek to identify the creative context of the individual [21].

Artificial Intelligence Software and Algorithms for Creative Use
Over the past decades, numerous algorithms have been developed for use in the creation of artistic products, such as the one employed by Aaron, a robot that runs on an algorithm programmed by Cohen, a visual artist. The algorithm has been perfected throughout the years and is used by the artist in the creation of his paintings [23]. Other examples are the Creative Adversarial Networks (CAN) system proposed by Elgammal et al. [27], which outputs novel art pieces inspired by a series of paintings, and Sony's Flow Machines technologies, an AI system used for music creation [5].
As mentioned above, Style Transfer is the Artificial Intelligence algorithm used in this research. This system captures different characteristics of the input image, separates them from the objects in the frame, and applies them to other images in order to create new stylized content [30]. The Style Transfer algorithm used for the empirical study is the one written by Jin [34], whose code aimed to improve on earlier works. It is written in Python and makes use of the following libraries: TensorFlow, NumPy, SciPy, and MoviePy.
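To make the separation of style from content more concrete, the following is a minimal NumPy sketch. It assumes the Gram-matrix formulation that is common in neural Style Transfer, where style is summarized by correlations between feature channels and content by the feature maps themselves. Whether the implementation in [34] computes its losses exactly this way is an assumption, and the function names, random feature maps, and weights below are hypothetical. In a real system the feature maps would come from a pretrained convolutional network rather than from random numbers.

```python
import numpy as np


def gram_matrix(features):
    """Channel-by-channel correlations of a feature map of shape (H, W, C)."""
    h, w, c = features.shape
    flat = features.reshape(h * w, c)      # one row per spatial position
    return flat.T @ flat / (h * w)         # (C, C); spatial layout is discarded


def content_loss(generated, content):
    """Mean squared difference between feature maps (layout-sensitive)."""
    return float(np.mean((generated - content) ** 2))


def style_loss(generated, style):
    """Mean squared difference between Gram matrices (layout-insensitive)."""
    return float(np.mean((gram_matrix(generated) - gram_matrix(style)) ** 2))


def total_loss(generated, content, style, content_weight=1.0, style_weight=10.0):
    """Weighted sum that a style-transfer optimizer would try to minimize."""
    return (content_weight * content_loss(generated, content)
            + style_weight * style_loss(generated, style))


# Hypothetical feature maps standing in for the output of a pretrained network.
rng = np.random.default_rng(0)
style_features = rng.normal(size=(8, 8, 4))

# Shuffling the spatial positions changes the layout (content) but keeps the
# channel statistics (style) intact.
shuffled = rng.permutation(style_features.reshape(-1, 4)).reshape(8, 8, 4)

print("style loss after shuffling:  ", style_loss(shuffled, style_features))
print("content loss after shuffling:", content_loss(shuffled, style_features))
```

In this sketch the style loss stays at zero (up to rounding) while the content loss does not, which illustrates why the two terms can be weighted against each other when a stylized image is generated.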
Since AI is a fairly recent area of study, few studies have related the use of AI systems to creative production. After an extensive search, the following were highlighted as the most important sources of inspiration for the conception of this research.
The first is the study by Elgammal et al. [27] concerning Generative Adversarial Networks (GANs). In that research, GANs were used as a way of bringing artistic works to life. The process goes as follows: first, the artist selects numerous artworks and then feeds them to an algorithm. The algorithm operates between two opposing forces, one that makes it create new pieces by attempting to mimic the images that were used as input, and another that pushes it into creating something different [42].
After the works were created, the researchers performed an assessment: a group of individuals was asked to analyze the pieces in terms of creativity, with the aim of understanding whether the subjects could discriminate works made by humans from the ones generated by AI. The authors also made use of qualitative analysis in order to better understand which subjective characteristics the subjects projected onto the artworks [27]. For this part, two collections of works created by human artists and four groups of works created by the system were selected [27]. The study concluded that the human subjects were frequently unable to discriminate AI art from human-created art (approximately 75% of the time). Besides this, subjects described the works with the terms "(...) 'intentional', 'having visual structure', 'inspiring', and 'communicative' at the same levels as the human-created art" [42].
Another relevant study is the one conducted by Hong and Curran [32]. In their research, the authors explored whether the perception of the creativity of several art pieces was affected by the perceived identity of the author (Artificial Intelligence system or human). This study was relevant not only because it deals with a similar theme (the perception of art pieces) but also because it is built on an extremely concise and meticulous method. First, the individuals were randomly selected. Those who suspected what the study was about were excluded, and the others remained. Next, the participants were shown a collection of art pieces by AI and human artists and asked to rate them on a variety of aspects, using a Likert scale from 1 to 5 that is often applied by professionals to measure creativity in artworks. The scale given to the participants uses the following aspects to measure the creativity of the artworks: "(...) originality, degree of improvement or growth, composition, development of personal style, the degree of expression, experimentation and risk-taking, aesthetic value, and successful communication of ideas (...)" [32].
The results were later analyzed with a t-test, and the values of the several groups were compared. Contrary to the study by Elgammal et al. [27], this study concludes that (1) there are still differences between the ratings of the artworks generated by Artificial Intelligence and the ones created by humans, so the outputs of the algorithm do not appear to show the same level of creativity as human-made artworks, and (2) the perceived identity of the artist does not affect the rating of the artworks [32].
Finally, another study regarding the use of AI for creative production is the one conducted by Chen et al. [22]. The authors proposed an AI-based, data-driven approach to ideation for design through the use of generative adversarial networks, and performed a design case study to see whether this system would contribute to the ideation process. Both experts and non-experts in design participated. The subjects were distributed across two groups, a treatment group and a control group. The participants in the treatment group made use of the created system. The participants in the control group did not have access to this tool; however, they could search on Google to help with the ideation process. The sample consisted of engineering students, but only two subjects in each group had studied design engineering. The study concluded that the approach can help create multiple different associations between concepts from different categories and assist in the generation of new ideas in a fast and easy manner, leading to a higher quantity and novelty of the generated products and concepts [22]. This study presents AI as a tool for creative ideation. However, it does not test the level of creativity of the concepts generated.
Considering all the studies summarized here, some ideas come up: (1) Although AI seems to be able to produce some of the same attributes of creativity as humans (as perceived by people), it is unable to produce artworks of an equal level of creativity [27,22]. Therefore, new studies on the creativity of the generated outputs are useful for clarifying this issue; (2) The artworks generated in the described studies were paintings and drawings. Therefore, it is important to verify to what extent other artistic forms (like video) can benefit from the use of Artificial Intelligence systems. New studies addressing this issue seem to be relevant; (3) Following the best practice of using experts and non-experts, future studies should, where possible, use participants from the two groups; (4) Following also the best practices concerning the type of data collected, future studies should measure qualitative and quantitative aspects of creativity at the same time; (5) So far, to our knowledge, the research undertaken on creativity and AI has used GAN and CAN systems. None of the studies have employed Style Transfer algorithms. This is relevant since that algorithm allows us to compare the same video creations transformed with an Artificial Intelligence algorithm (AI) and without the use of Artificial Intelligence (No-AI), making comparisons between AI and No-AI more reliable and accurate.
These challenges are considered in the empirical research undertaken within the scope of the present paper, which is described below.

METHOD
3.1 Experimental Study
An experimental design was chosen to verify to what extent an AI algorithm improves the creativity of videos. Twelve videos were used, of which six were identical (unaltered) for both the control and experimental conditions. The remaining six were left unaltered for the control condition, while an AI-transformed version was shown in the experimental condition. The AI videos were transformed through the Real Time Style Transfer algorithm [34]. Both quantitative and qualitative data were collected, as the two types of data complement each other and therefore add richness to the study [41].
The present study is designed to verify to what extent an AI algorithm improves the creativity of videos. The topic at hand is creativity, which resides very strongly within the world of ambiguity and abstraction. Given this context, qualitative research contributes to a richer understanding of the ways in which the participants perceive the creative aspects of the generated outputs. Quantitative research, on the other hand, is connected to the gathering of numerical data. This type of research allows us to collect data more quickly, analyze it, and discover cause-effect relations from the obtained results [28]. However, the distinction between qualitative and quantitative research is somewhat blurred. Between quantitative and qualitative studies lie mixed methods research designs, which combine both quantitative and qualitative components and come in many different forms [38,39]. In the present study, a traditional quantitative research design is used: experimental research. It contributes to strengthening the evidence of possible differences in the perceived creativity of the videos. Furthermore, a qualitative component is added to capture the subjective nature of the evaluations. It allows researchers to grasp the complexities of existence and the rich and meaningful ways in which they are experienced by the individuals [24].

Participants
The participants (data sources) for the empirical research consisted of 52 experts in visual arts and 49 non-experts. Experts were defined as those who had a university degree specialized in the arts (for example, the bachelor's degrees "Plastic Arts and Multimedia" and "Painting"). Non-experts were defined as those who had a university degree that did not specialize in the arts. The experimental group included 27 experts and 25 non-experts, and the control group included 25 experts and 24 non-experts. The reasoning behind the distinction between experts and non-experts is that (a) they are expected to differ regarding the evaluation criteria of creativity in visual arts, the focus of this research, and (b) together they constitute the public of visual art products. Therefore, a wider range of people were the source of data and the results are more generalizable. The sample size stems from an attempt to reach an equilibrium between the validity of the study and the ability to perform the tests on time. Too small a sample would produce results of low validity, since they could be due to specificities of the participants. On the other hand, a very large sample would be hard to collect on time. The decision on the number of participants was made in agreement with the advice of two senior researchers. Furthermore, to control for bias, data regarding "gender" and "year of birth" were also collected. The demographic differences between the control and experimental groups (gender, age group, number of experts/non-experts) were controlled for until demographic equivalence between the two groups was reached. Accordingly, gender, age, and expertise were tested for possible differences between the groups (see Section 3.4). Despite our concern, no significant differences were observed.
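The balance of expertise between the two groups can be checked with a chi-square test of independence on the counts reported above. The following is a minimal sketch in plain Python, assuming a Pearson chi-square test without continuity correction on a 2 x 2 table. The function name is hypothetical, and the statistical software actually used in the study is not stated in this section.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2 x 2 table (df = 1).

    No continuity correction is applied (an assumption of this sketch).
    Returns the test statistic and the p value.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # With one degree of freedom, the upper-tail probability of a chi-square
    # variable equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: experimental, control. Columns: experts, non-experts.
expertise_by_group = [[27, 25], [25, 24]]
chi2, p = chi_square_2x2(expertise_by_group)
print(f"chi2 = {chi2:.3f}, df = 1, N = 101, p = {p:.3f}")
```

With these counts the statistic is close to zero and the p value is about .93, so the sketch indicates no association between group and expertise.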

Materials
For this study, 12 excerpts of artistic videos created by Nuno Barbosa (permission was granted), between 45 and 115 seconds long each, were screened. The videos are artistic and include visual transformations and elements of animation and design, some resembling a moving collage. The videos often depict dream-like atmospheres and had previously undergone visual transformations that modified the colors, elements, and atmosphere of the pieces. The choice of artistic videos that were aesthetically manipulated relates strongly to the visual nature of the Style Transfer output. If the pieces were not artistic, the stylized output versions would more likely be considered more creative than the input ones, making this a biased comparison. The videos were split into two groups of six videos. The first group (Shared Videos) remained unaltered (no AI), with the exception of segment selection, after which they were coded (new title for the video) and labeled (see Table 1). This group of videos was used to check for bias.
The second group (Differentiated Videos; Table 2) was altered by the Style Transfer algorithm for the experimental condition, while the videos remained in their original state for the control condition. The input paintings (Styles) were chosen based on two criteria: (1) the pieces had to differ from each other stylistically, and (2) they had to be universally recognized as art pieces. In total, three styles were applied, each occurring in two videos (see Table A1 in Appendix A). The style applied to each video was chosen based on the combination of art pieces and videos that generated visual results with the least noise and fewest visual artifacts.
Instrument. A questionnaire (see Appendix B) was created to measure creativity, which included quantitative and qualitative items. Quantitative items consisted of ordinal items on a Likert scale from 1 (very low) to 5 (very high), regarding the various aspects that compose creativity: "Aesthetic Quality", "Value of The Concept or Idea", "Communicative Power", "Expressiveness", "Originality of The Composition", and "Level of Innovation". The 5-point Likert items were chosen because of their frequent use and, consequently, a higher level of familiarity for the participants. The quantitative items were adapted from the questionnaire employed by Hong and Curran [32].
A qualitative question was included for the qualitative evaluation of the creativity present in each video: the participants were asked to write down two words or expressions describing the video in terms of the creativity observed. These expressions were then interpreted and assigned to the one of the six elements of creativity to which they were considered to correspond most strongly. "Aesthetic Quality" was used to categorize words and terms that describe the visual and sonic characteristics of the video. "Value of The Concept or Idea" was the category for words that describe concepts associated with the videos. "Communicative Power" included the words that related to the ideas transmitted by the artwork and the emotional impact generated by the videos in the participants. "Expressiveness" was used to aggregate adjectives that describe the elements that contribute to the communication of the ideas and emotional states conveyed in the pieces. "Originality of The Composition" included terminology that aims to describe how different and unique the pieces are. Finally, "Level of Innovation" was used to aggregate words that describe the novelty and value of the creative ideas implemented in the video.
Four items were added regarding socio-demographics: gender, year of birth, educational level and professional specialty.

Procedure
The videos were uploaded to a YouTube account and hidden from the general public so that they could only be accessed through a direct link. This procedure was intended to prevent other individuals from accessing the videos in the experiment. Comments were disabled on the original page and the likes/dislikes ratio was hidden to avoid possible bias.
In the selection of the excerpts, particular attention was paid to two aspects. The first was the level of continuity: the excerpts were cut into carefully chosen segments in order to make the viewing experience feel as uninterrupted as possible and to ensure that the pieces appeared complete. The second was the exclusion of written elements, as these could influence the evaluation of creativity and bias the responses to the questionnaire.
A snowball sampling procedure was adopted. The sample was built by identifying possible subjects from the personal network of the researchers. Those subjects were contacted by email, phone, or Messenger and asked to participate in the research and to indicate other possible subjects who met the inclusion criteria. The subjects were informed that the videos were not created by the researchers. They were asked not to share the questionnaire or its contents with non-eligible individuals. Experimental and control conditions were randomly assigned to the subjects. The sampling procedure was stopped when the intended number of subjects was reached. In total, 198 people were directly contacted.
The research protocol was composed of an introduction explaining the objectives of the study (without disclosing the use of an AI algorithm), an informed consent form, and six videos, each one followed by the items from the questionnaire (see Appendix B). Subjects who wished to receive a summary of the results of the study provided their email address. In total, 118 individuals participated. In order to keep the ratio regarding the socio-demographic variables balanced between both groups, 17 participants were excluded. The participants to exclude were selected randomly, considering the existing imbalance between the experimental and control conditions. The final sample included 101 participants, 52 in the experimental group and 49 in the control group. Table 3 provides an overview of the participants. The χ² test reveals that both male and female genders are equally represented in the control and experimental conditions (χ² = .131; df = 1; N = 101; p = .718). The same holds for the distribution of experts and non-experts in the control and experimental conditions (χ² = .131; df = 1; N = 101; p = .928). Lastly, the mean age of
Control condition. Two sets of six videos were created for the control condition. The first set (set K) contained three of the Shared videos and three of the Differentiated videos (but without AI alteration). The second set (set L) contained the remaining three Shared and three Differentiated (but without AI) videos. Furthermore, both sets (K and L) were split into two separate questionnaires containing the same videos, one of them in reversed order (sets Kv and Lv) to avoid bias, resulting in a total of four questionnaires (K, Kv, L, and Lv).
Experimental condition. The experimental condition followed the same procedure. Two sets of videos were created to be shown in the questionnaires (sets X and Y), both of which contained Shared and Differentiated videos. However, the Differentiated videos were transformed through the use of the Style Transfer algorithm. Each set was then split into two questionnaires, again containing the same videos but one of them in reversed order (sets Xv and Yv) to avoid bias. The procedure resulted in a total of four questionnaires to be distributed amongst the participants in the experimental group (X, Xv, Y, and Yv). Figure 1 shows which videos were used in each questionnaire. The arrows in the figure represent the flow of the questionnaire. The figure shows that some questionnaires start with a Differentiated video, while others start with a Shared video. This, together with the reversed order, minimized order and primacy effects. Furthermore, the Differentiated and Shared videos were shown in alternation with each other.
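The construction of the eight questionnaires can be sketched in a few lines of Python. The video identifiers (S1 to S6 for Shared, D1 to D6 for Differentiated) and the assignment of particular videos to each set are hypothetical placeholders, since the actual allocation is given in Figure 1. Only the structure (two sets per condition, alternating video types, and a reversed-order version of each set) follows the description above.

```python
from itertools import chain

# Hypothetical identifiers; the real allocation of videos is shown in Figure 1.
SHARED = ["S1", "S2", "S3", "S4", "S5", "S6"]
DIFFERENTIATED = ["D1", "D2", "D3", "D4", "D5", "D6"]


def interleave(first, second):
    """Alternate items from two equally long lists: a1, b1, a2, b2, ..."""
    return list(chain.from_iterable(zip(first, second)))


def build_questionnaires(names, transformed):
    """Build two six-video sets plus their reversed-order versions.

    names       -- labels of the two base sets, e.g. ("K", "L")
    transformed -- True if Differentiated videos carry the AI transformation
    """
    suffix = "-AI" if transformed else ""
    questionnaires = {}
    for index, name in enumerate(names):
        shared = SHARED[3 * index: 3 * index + 3]
        differentiated = [v + suffix
                          for v in DIFFERENTIATED[3 * index: 3 * index + 3]]
        order = interleave(shared, differentiated)
        questionnaires[name] = order
        questionnaires[name + "v"] = order[::-1]   # reversed-order version
    return questionnaires


control = build_questionnaires(("K", "L"), transformed=False)
experimental = build_questionnaires(("X", "Y"), transformed=True)

for name, videos in {**control, **experimental}.items():
    print(name, videos)
```

Because each base order alternates Shared and Differentiated videos, its reversed version automatically starts with the other type, which matches the observation that some questionnaires open with a Shared video and others with a Differentiated one.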

Dimensions of Analysis.
A model of important concepts for the conceptualization of this study was conceived in order to clarify the relevant aspects to be explored (see Figure 2). As mentioned above, the various elements evaluated were taken from the definitions and concepts explored in the theoretical framework and from the questionnaire used by Hong and Curran [32]. The items "Experimentation and Risk Taking" and "Development of Personal Style" from the original questionnaire were not applied in this study because they focus on creative aspects which relate to the artistic process and are better analyzed through the study of one's progress and personal development than through the final product.
Next to each item on the questionnaire, a short sentence clarifying it was written in order to make the questionnaire accessible to those who are not familiar with the terminology, which was important due to the inclusion of non-experts as respondents. This clarification also contributed to ensuring that all the participants interpreted the items similarly.
The dimensions chosen for the analysis of the concept of creativity were defined based on some of the authors and concepts introduced in the theoretical framework, such as: (a) the object of analysis (the creative product) defined by Amabile [1]; (b) the importance of communication in multimedia products as introduced by Dolese [26]; (c) the definitions of creativity presented by Chakrabarti and Sarkar [21]; (d) the concept of originality as defined by Runco and Jaeger [48]; (e) the definition of expressiveness introduced by Osborne [45]; and (f) the concept of creative process as defined by Taylor [51].
According to Amabile [1], a product can be considered creative when it is evaluated as creative by those familiar with the domain in which it was created. In this study audiovisual pieces are the object of analysis, therefore the dimensions "Value of The Creation" and "Originality of The Product" were considered relevant. The first one relates to the core aspects of creative products in which the piece might be considered more or less valuable and is measured through the indicators "Aesthetic Quality", "Value of The Concept or Idea", "Communicative Power", and "Expressiveness". The second one refers to how novel and innovative the piece is, which was measured through the indicators: "Originality of The Composition" and "Level of Innovation". In the next paragraphs, the various indicators, the reason for the choice of each one as well as the sources of inspiration for their creation will be presented.
Aesthetic Quality. The first element of the questionnaire asks participants to evaluate the aesthetic quality of the artwork. The source of inspiration for this item was the element "Aesthetic Value" present in the questionnaire by Hong and Curran [32]. Next to it, the following clarifying sentence is presented "I consider that the various elements of the composition are properly constructed and visually interconnected".
Value of The Concept or Idea. This item was inspired by the ideas presented by Chakrabarti and Sarkar [21], who define the creation of ideas as an important part of the creative process. The element was further clarified through the use of the sentence "I consider that the idea or concept is rich and profound" in the questionnaire.
Communicative Power. Not only is the quality of the message relevant; the strength of the communicative ability is also important. This indicator was selected for the study because it measures one of the core aspects of multimedia products: the ability to communicate emotions and ideas. The element "Communicative Power" was also based on the questionnaire by Hong and Curran [32]. Next to the item, the following clarification was written: "I consider that a message was transmitted".
Expressiveness. Expressiveness relates to the elements that contribute to the communication of emotional states [45]. This indicator was inspired by the item "Degree of Expression" present in the reference questionnaire [32]. Next to this element, the following clarification was written: "I consider that the elements are organized in a way that contributes to the communication of the message".
Originality of The Composition. As stated by Runco and Jaeger [48], originality is one of the key aspects of creativity. Therefore, this item was chosen to measure the impact of the AI algorithm on the originality of the pieces. This item was also inspired by the element "Composition" present in the questionnaire that served as a reference. The sentence "I consider that the composition is outside of the norm" was written next to the item in order to further explain it.
Level of Innovation. According to Taylor [51], innovation relates to the creation of something novel that positively impacts the world. Therefore, this item was considered important for understanding how the AI algorithm impacts the originality of the creative product. It was also inspired by the item "Degree of Improvement and Growth" from the questionnaire and by the ideas of Chakrabarti and Sarkar [21], previously presented. This element was clarified through the use of the sentence: "I consider that I saw something new".

Quantitative Data Treatment.
The quantitative evaluation of the six short videos without AI intervention made by the experimental group was compared with that made by the control group. No significant differences between the experimental group and the control group were expected under the Mann-Whitney U test.
The quantitative evaluation of the six short videos with AI intervention (experimental group) was compared with that of the same videos without AI intervention (control group). Significant differences were expected (higher creativity in the short videos with AI intervention). To treat the data, a Mann-Whitney U test was applied using the software SPSS.
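To make the comparison concrete, the following is a minimal sketch of a two-sided Mann-Whitney U test for two independent groups of Likert ratings. It is not the SPSS procedure used in the study: it is a plain Python illustration that assumes midranks for ties and a tie-corrected normal approximation without continuity correction. The function name `mann_whitney_u` and the two rating lists are hypothetical and are not study data.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (illustrative sketch).

    Uses midranks for tied values and a tie-corrected normal
    approximation without continuity correction.
    Returns (U for sample x, two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    pooled = sorted(list(x) + list(y))

    # Give every distinct value the mean rank of its block of ties.
    rank_of = {}
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        t = j - i                              # size of the tie block
        rank_of[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j

    rank_sum_x = sum(rank_of[v] for v in x)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2

    n = n1 + n2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:                             # all observations identical
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_two_sided


# Hypothetical 5-point Likert ratings for one item of one video.
experimental = [4, 5, 3, 4, 5, 4, 2, 5]   # AI version (illustrative)
control = [3, 2, 4, 3, 1, 3, 2, 4]        # No-AI version (illustrative)

u, p = mann_whitney_u(experimental, control)
print(f"U = {u}, p = {p:.4f}")
```

Because the test works on ranks only, it treats each rating as ordinal, which matches the single-item Likert measurement used for each element of creativity.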

Qualitative Data Treatment.
A content analysis of the words describing the videos was performed. For the shared videos, the words used by the experimental group were compared with the words used by the control group in order to verify whether they were similar. Following this, the corresponding comparison was performed for the videos that have two versions (AI and No-AI). This process was carried out using NVivo software, version 12. The data was classified according to six main categories adapted from the concepts expressed in the quantitative items. Additionally, a category "Others" was added for responses that were outside of the scope of the question, such as: "I don't know what term means".
Two researchers with varying backgrounds coded the responses independently of each other. Where differences in coding occurred, they were discussed until an agreement was reached. In an attempt to attain a higher level of accuracy, researchers from both psychology and arts were involved in the coding process.

Quantitative Data.
This section analyzes the participants' quantitative responses for each element of creativity per video. First, results are checked for bias using the shared videos. Then, if no bias is found, the hypothesis that AI increases the perceived creativity can be tested using the differentiated videos. To perform the analysis, the Mann-Whitney U (MWU) test was applied in conjunction with a Bonferroni correction. Although the parametric between-subject t-test and the MWU test generally show similar results on a 5-point Likert scale [54], the MWU test was chosen for this study because each element of creativity was measured with a single question at the ordinal level. Hence, no average that more closely represents interval-scale values can be constructed for each element, and non-parametric tests are more suitable [19]. Furthermore, a Bonferroni correction is applied to account for type-I error due to the large number of tests used to answer the hypotheses [3].
To check for bias, data on six videos were collected through a questionnaire with six items, each of which corresponded to an element of creativity. This results in a total of 36 tests. Hence, the adjusted α-level is set to 0.05/36 = 1.39 × 10⁻³. When a significant difference is observed at this level, it can be considered a true positive, resulting in rejection of the null hypothesis (that there is no bias). After running the MWU tests, no significant differences were observed (see Appendix C). This means that the experimental and control groups are considered equivalent regarding the criteria that contribute to the judgement of the creativity of videos. Therefore, all differences found between the experimental and control groups in the evaluation of the differentiated videos are considered to be a consequence of the characteristics of the audiovisual pieces.
Concerning the differentiated videos, the same Bonferroni correction was applied since the analysis is analogous, except that here the tested hypothesis is that AI-transformed videos (experimental condition) are perceived as more creative than No-AI videos (control condition). Some significant differences were found, in multiple directions. Two dimensions were rated significantly higher in the experimental group and one dimension was rated significantly higher in the control group. In three of the six differentiated videos no significant differences were observed. In particular, in video D1 the control group gave significantly higher ratings for Aesthetic Quality than the experimental group. In videos D2 and D6, however, the experimental group gave higher ratings than the control group for Originality and Innovation, respectively (see Table 4).
Furthermore, a post-hoc analysis, performed after collapsing the elements of creativity across the differentiated videos and using a Bonferroni correction (α = 0.05/6 = .008), revealed that the experimental group reported higher levels of originality (M = 3.67, SD = 0.78) than the control group (M = 3.27, SD = 0.81); t = −2.740, df = 99, p = .007. Because this is an unplanned post-hoc analysis, a Bonferroni correction was necessary [3].
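The two adjusted thresholds quoted in this section follow directly from dividing the nominal α by the number of tests. The short sketch below reproduces that arithmetic; the helper name `bonferroni_alpha` is hypothetical, and the only inputs are the values stated in the text (α = 0.05, 36 tests for the bias check, 6 tests for the post-hoc comparison, and the reported p = .007).

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level under a Bonferroni correction."""
    return alpha / n_tests


# Bias check on the shared videos: 6 videos x 6 items = 36 tests.
bias_alpha = bonferroni_alpha(0.05, 6 * 6)
print(f"{bias_alpha:.2e}")       # 1.39e-03

# Post-hoc analysis: 6 collapsed elements of creativity.
posthoc_alpha = bonferroni_alpha(0.05, 6)
print(f"{posthoc_alpha:.3f}")    # 0.008

# The reported originality result (p = .007) falls below the threshold.
print(0.007 < posthoc_alpha)     # True
```

Note that the post-hoc threshold is 0.0083 before rounding, so the reported p = .007 clears it by a small margin.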

Qualitative Data.
Each participant was asked to describe the creativity of the videos using two words or expressions. In total, 1,208 references were coded into one of seven categories: the six elements of creativity (Figure 2) and the remainder category "Others". The category "Others" comprises references that were excluded from the analysis because they were not real responses to the question. For instance, answers such as "I don't know" or just a comma were considered invalid responses. In total, 20 references were coded as "Others". In "Aesthetic Quality" 453 references were coded, in "Communicative Power" 329, in "Value of The Concept or Idea" 239, in "Expressiveness" 61, in "Level of Innovation" 27, and in "Originality of The Composition" 79. To examine the differences between control and experimental conditions, shared videos and differentiated videos were analyzed separately. For the shared videos, the qualitative analysis was used to check whether the control and experimental groups used equivalent criteria in the qualitative evaluation of creativity.
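As a quick consistency check on the counts reported above, the category totals can be tallied and expressed as shares of all coded references. The sketch below uses only the figures stated in the text; the variable names are illustrative.

```python
# Coded references per category, as reported in the text.
coded = {
    "Aesthetic Quality": 453,
    "Communicative Power": 329,
    "Value of The Concept or Idea": 239,
    "Originality of The Composition": 79,
    "Expressiveness": 61,
    "Level of Innovation": 27,
    "Others": 20,
}

total = sum(coded.values())
valid = total - coded["Others"]   # references kept for the analysis

# Share of each category among all coded references, in percent.
shares = {name: round(100 * count / total, 1) for name, count in coded.items()}

print(total)   # 1208
print(valid)   # 1188
for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name}: {share}%")
```

The seven counts add up to the 1,208 references reported, with "Aesthetic Quality" alone accounting for more than a third of them.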
A list was made of the terms most used by both groups. Similar terms such as "Color" and "Colors" were aggregated. It was observed that, in general, most of those terms were used with a similar frequency by both groups. Moreover, all of the 10 most frequent terms were used by both groups. See Table 5 for further details.
In the differentiated videos, a total of 305 references were coded for the experimental group and 293 for the control group. As shown in Figure 3, the most coded element for both groups was "Aesthetic Quality". "Level of Innovation" was the least coded element of creativity for both groups. Table 6 shows the frequency of responses related to each element of creativity per video for both the control (No-AI videos) and experimental condition (AI videos). The most frequent category in video D1 is "Aesthetic" (Aesthetic Quality) for the experimental group (AI video), while the control group most frequently referred to "Concept" (Value of The Concept or Idea). For video D2, the frequency with which each element of creativity was referred to is roughly equally distributed within both the control and experimental groups. The same is observed for videos D4, D5, and D6. Finally, for video D3, the control group attributed most references to "Aesthetic", while for the experimental group most references were shared between "Aesthetic" and "Concept".
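The aggregation and frequency comparison described above can be sketched as follows. This is only an illustration of the idea, not the NVivo workflow used in the study: the alias table, the function names, and the example responses are all hypothetical.

```python
from collections import Counter

# Hypothetical alias table: variants mapped to one canonical term.
ALIASES = {"colors": "color"}


def normalize(term):
    """Lower-case and trim a response, then merge known variants."""
    key = term.strip().lower()
    return ALIASES.get(key, key)


def top_terms_by_group(experimental, control, n=10):
    """Return (term, experimental count, control count) for the n terms
    that are most frequent across both groups combined."""
    exp_counts = Counter(normalize(t) for t in experimental)
    ctl_counts = Counter(normalize(t) for t in control)
    overall = exp_counts + ctl_counts
    return [(t, exp_counts[t], ctl_counts[t]) for t, _ in overall.most_common(n)]


# Illustrative responses only, not data from the study.
exp_words = ["Color", "Colors", "Music", "Rhythm"]
ctl_words = ["colors", "Rhythm", "Music", "Light"]

for term, n_exp, n_ctl in top_terms_by_group(exp_words, ctl_words, n=3):
    used_by_both = n_exp > 0 and n_ctl > 0
    print(term, n_exp, n_ctl, used_by_both)
```

An explicit alias table is used here instead of automatic stemming so that every merge of two terms remains a visible, reviewable decision.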
To conclude the analysis, the responses on each video were further analyzed to understand the underlying nature: whether positive or negative qualities were attributed to the elements of creativity.
In general, qualitative differences between the AI and No-AI versions of the videos were found. Those differences express the transformation made by the use of Style Transfer. Some of those differences are positive, such as for the category "Value of The Concept or Idea" of the AI video D5, in which more diverse and abstract terms were used to describe the concepts of the video. Similarly, in the same category of video D6, a wider use of abstract terms like "Shifted Reality" and "Parallel Universe" was observed. Some of the differences derived from the use of the algorithm are negative, such as in the category "Aesthetic Quality" of video D1 (AI), which was described as "Confusing" and "Strange", and of video D3, which in the same category was described as being excessively filled. A detailed description of the qualitative results for each video is presented in Appendix D.

Discussion
The results of this study show that the implementation of this algorithm can raise the evaluation of the elements of creativity, as was the case with videos D2 and D6. Style Transfer can also negatively impact the public's perception of creativity, as was the case for the parameter "Aesthetic Quality" of video D1. However, as observed in videos D3, D4, and D5, in some scenarios the use of this algorithm has no significant impact on the public's perception of creativity. It is important to understand why these scores differ, and what the success or failure of Style Transfer depends on, for it to be a viable technique to enhance creativity in videos. A number of aspects might play a role in the disparity among the evaluations of the different videos. In the next paragraphs, each of the videos is analyzed and possible reasons for differences in the scores of the various elements of creativity are considered. In some cases, certain visual irregularities seem to dissipate because the algorithm merges and blends them with the general environment, as is the case in video D2. Here, the category "Originality of The Composition" was rated higher as a result of the algorithm, which might be due to two reasons: first, this is a novel technique that is not often employed in videos; second, it can enhance certain visual elements of the video. Here, the tree branches, mountains, and moon were aesthetically redefined, acquiring the look of a painting and enhancing the environment and emotional impact of the piece. Furthermore, for this particular video, the visual content was produced using a green screen. This added elements that appeared to be two-dimensional in a three-dimensional environment, which can give the viewer a sensation of elements being "out of place".
When the algorithm was applied on top of this video, the two- and three-dimensional aspects merged to form a more coherent scenery: the clear distinction between two- and three-dimensional aspects disappeared, making the environment more consistent. Regarding video D6, it is important to consider that the original excerpt of this piece was visually cleaner than most of the other videos, which might have made it seem as if the implementation of Style Transfer endowed the video with greater visual richness. In most scenes, the algorithm created shapes very distinct from those existing in the original video, contributing to higher ratings in the category "Level of Innovation".
AI enhanced certain (creative) aspects of the videos in a positive sense. However, its impact was also noted in the opposite direction. In some videos the overall quality of the image was affected. In areas with less contrast, the algorithm failed to apply the style in a way that creates a clear distinction between shapes, making the images at times unclear. Furthermore, in videos that featured many variations in lighting (creating optical flares or lens flares) and many blurred shots, the temporal consistency of the image was affected, resulting in visual noise in some scenes. In video D1, the Style Transfer algorithm took away from the "Aesthetic Quality" of the video. This might be a consequence of the style employed, which faded the colors, contrasts, and depth of the image. Furthermore, some parts of the video included previous visual transformations, textures, and layering of images; the chosen style negatively impacted those elements, making them imperceptible. Video D1 aims to present memories. To communicate this clearly, the variation in colors and textures that emerges in the video is necessary. When the particular style (from the painting The Great Wave off Kanagawa by Katsushika Hokusai, 1831) is applied through the Style Transfer algorithm, those elements lose definition. Therefore, important aspects that contribute to the communication of the story are lost, and the concept becomes unclear.
Lastly, in some videos no difference was observed. Regarding the video D3, this could be due to the fact that in this piece, a series of graphic elements are composed on top of buildings, interacting with them and contrasting the rich colors of the designed elements with the textures and neutral tones of the buildings. In the context of the Style Transfer algorithm, this does not appear to bring anything particularly novel to the composition. On the one hand, it contributes to the creation of an interesting aesthetic, on the other hand, however, it takes away from the feeling of two different coexisting elements, making the tones, textures, and lines more uniform. The same appears to happen in the videos D4 and D5. Both of these audiovisual creations included shot and digitally designed objects. Here, the contrast between the shot video and the 3D and 2D animated elements is lost. Contrary to video D2, in which the AI version had significantly higher ratings in one of the six elements of creativity brought by the uniformization of the video, in D4 and D5 the contrast was intentional and contributed to a richness in the aesthetic of the pieces. Furthermore, before the transformation, the digitally produced elements had colors that differed from the rest of the scenario, which contributed to them standing out and enhanced the contrast between the various objects in the picture. By applying the Style Transfer algorithm, this difference was erased. Just as in the previous video, this AI system added a new visual environment and contributed to the creation of a new aesthetic, while simultaneously removing some aspects that influence the video's creativity, therefore not producing any significantly higher or lower values in the evaluation of the elements of creativity.
It appears that when the algorithm is applied to videos that intentionally contrast graphic elements with shot video to create a visually diverse environment, the creativity ratings are unaffected. On the other hand, in videos in which the goal is to create an immersive environment and to make the elements in the image uniform, such as in D2, the audiovisual piece is positively affected by the use of this algorithm.
Regarding the qualitative data, the results also show both similarities and differences between the experimental and control groups. Taken together, the qualitative and quantitative results show that the AI system, in some cases, contributes to the improvement of the creativity expressed in the videos. This was the case for the category "Communicative Power" of videos D2: the words describing the emotional states were more positive for the AI video than for the No-AI video, in which the piece was seen as "Abominable" and "Exaggerated". Something similar happened in the category "Value of The Concept or Idea" of videos D6. Here, the application of the algorithm led the participants to use terminology connected with an altered state of perception for the AI video, instead of terms that simply describe the objects observed, as happened with the No-AI version.
In other cases, the application of the algorithm seems to negatively impact the quality of creativity, as was the case in the category "Communicative Power" of video D1. Here, terms such as "Strange" and "Confusing" were used to describe the piece. Furthermore, in the category "Expressiveness" of the same video, a wider variety of responses was given for the No-AI video.
To conclude, it was hypothesized that the transformation through the use of the Style Transfer algorithm would make the pieces be perceived as more creative. However, it seems that AI systems (in this case, Style Transfer) are not, per se, sufficient to improve the creativity expressed in the videos, and may instead require the artist's intervention. That complex influence highlights the importance of the concept of Hybrid Intelligence [25]. As previously mentioned, humans and machines have complementary skills. This concept therefore consists of the convergence of Artificial Intelligence and Human Intelligence, which can unite these different strengths in a way that benefits the world.

Limitations and Future Studies
One limitation for this study consists of the fact that only one Artificial Intelligence algorithm was studied: Real Time Style Transfer by Jin [34]. This means the results only apply to that algorithm. In the future it will be important that similar studies are conducted with other AI systems, in order to consolidate the obtained results.
Creativity is only one of the characteristics of the videos and of the way they are perceived. Other aspects must be considered in future studies, for example the preference for, the beauty of, and the appropriateness of the videos to specific objectives and contexts.
Furthermore, the transformation produced by the algorithm was introduced on top of a finalized art piece. In the future, it will be important to conduct studies in which Style Transfer is applied during the creative process. Also, the study focuses solely on art videos; other art forms should be explored in future studies equivalent to the one presented here. Another limitation stems from the fact that the study was conducted online due to the pandemic, which made it harder to achieve ideal experimental and control conditions.
Despite those limitations, this study is expected to contribute to the understanding of the impact of the use of Artificial Intelligence systems in arts and entertainment. Human beings appear not to be replaceable by the algorithm. The new avenues opened by the use of those systems seem promising.

CONCLUSION
The present paper aimed: (a) to verify to what extent the use of an Artificial Intelligence algorithm ensures a more creative artistic outcome in videos as perceived by individuals; (b) to identify which creative elements the use of an Artificial Intelligence algorithm improves or deteriorates in an artistic product, the videos; (c) to compare the spectators' perception of the pieces created with and without the use of an Artificial Intelligence algorithm.
The results show that: (1) for some videos the implementation of the Style Transfer software seems to improve the elements of creativity; (2) in other videos the opposite occurs. Certain visual aspects can affect the algorithm's success in increasing creativity, and as Style Transfer becomes a more widely used process, its novelty will decrease. The results reinforce the idea that it is important that this system is mediated by artists, supporting the concept of Hybrid Intelligence presented by Delleman et al. [25].
In the past, the introduction of new technologies in the art sphere, such as the photographic camera, caused some transformations in the field. Likewise, it is expected that Artificial Intelligence will deeply transform the artistic world [2]. AI systems have already been shown to help in the process of ideation, which is an important task for artistic creation, as presented in the study by Chen et al. [22].
Since humans and machines have complementary skills [24], these technologies might become useful tools for artists, allowing them to transform pieces in ways that would be impossible or very time-consuming without them. Furthermore, AI systems do not appear to originate a creative result on their own, but to benefit from the presence of humans. In this study, the Style Transfer was not used as a tool to serve the creative product but instead applied with neutral research purposes. The results showed that by doing this the evaluation of creativity varied irregularly, sometimes positively, other times negatively. Therefore, the artist does not appear to be replaceable, as their central position during the creation of art pieces is reinforced by this study. The results seem to indicate that the concept of hybrid intelligence will be a promising path for the application of these technologies. In future research the use of such algorithms by artists can help verify to what extent these systems can contribute to an improvement of the creativity of the products when used as a tool during the creative process.
The limitations of the present study encourage further research that studies other AI systems, that explores elements such as the preference for, the beauty of, and the appropriateness of the videos, and that analyzes the use of Artificial Intelligence systems during the creative process rather than as a post-production application. Despite the limitations, the goals of the present study were reached.

APPENDICES
A APPENDIX
The following table (Table A1) shows which painting's style was applied to which video.

Shared Videos
The videos shared by the control and experimental groups can be found through the following links:
• S1: https://www.youtube.com/watch?v=VVqsObss9Cs.

B APPENDIX
The participants were shown a questionnaire containing the questions shown in Table B1. Additionally, before starting the questionnaire, the participants were shown instructions alongside the contact information of the main author. Furthermore, each video had one question in which the participants were asked to describe the video in one term; for example, "out of my league" and "beautiful" are both considered terms.

D APPENDIX
Video D1. For video D1, the terms in the category "Aesthetic Quality" differed for both groups. In the AI video, the aesthetic characteristics are described through a variety of words such as "Effects", "Montage", "Impressionism", and "Abstract". In the No-AI videos, the words seem to describe the more technical aspects of the image, such as "Colors", "Technique", "Photography", and "Rhythm". For the category "Value of The Concept or Idea" both groups described the concepts "Children", "Life", "Nature", and terms that relate to the experience of memories and past experiences. A difference found in the evaluation of both videos was that in the No-AI version words relating to commercial use such as "Advertisement", "Institutional", and "Marketing" were employed. In the category "Communicative Power" some differences were observed. For the No-AI video, terms related to affective states such as "Nostalgia", "Emotional", and "Profound" were used. For the AI videos, "Nostalgia" was used as well, but more often terms such as "Confusing" and "Strange" were used to describe the piece. For the element of creativity "Expressiveness" there were differences in the responses of the participants. The AI video had one response with the word "Expressive". The No-AI video had a wider variety of responses, including the terms "Personality", "Focus", and "Revealing". In the category "Originality of The Composition", the responses were not widely different for both groups. The AI video was evaluated both positively, with the terms "Original" and "Out of This World", and negatively, with the term "Average". The No-AI video also had positive evaluations of originality, seen in the use of words such as "Diverse", and negative ones that categorize it as "Common". In regard to the category "Innovation", only the AI video had a response, the word "Rupture".
Video D2. In the videos D2, for the category "Aesthetic Quality", both the AI and No-AI versions included the terms "Effects" and "Music". The AI video also had responses that differed from the No-AI version, with words that refer to a stylistic approach, such as "Texture", "Impressionism", and "Overlap". The word "Cohesive" was also used to describe the piece. In the No-AI version, often terminology that refers to the tones of the video was used, through words such as "Black and White" and "Dark". Terms that connect with drawing were also often employed, through words such as "Illustration" and "Animation". For the category "Value of The Concept or Idea", the concepts associated with the AI version were mostly abstract, including terms such as "Symbolic", "Fragmented Word", and "Unfolding". The words used to describe the concept of the No-AI video were less abstract, often deriving from direct associations made with specific elements on the screen or the typology of the video. Words such as "Night", "Storytelling", and "Virtual Reality" were included in this section. In the category "Communicative Power" some differences were also observed. While both groups included terminology related to the emotional experience of sadness that's transmitted by the singer, in the AI version this was more frequent, through the use of words such as "Pain", "Fear", and "Regret". For the No-AI version, responses also included a negative emotional response of the participants to the video, seen in the use of terms like "Abominable" and "Exaggerated". Regarding the category "Expressiveness", the AI video was described through the use of the term "Expressive" repeatedly. For the No-AI video, there was a wider variety in the terms used to describe them, which included words like "Poetic" and "Objective". 
For the category "Originality of The Composition", the AI and No-AI videos both had a similar number of negative evaluations of creativity, observed in the use of words such as "Okay" and "Forgettable" for the AI video and "Basic" and "Cliche" for the No-AI video. The videos also both had one positive evaluation of creativity each, for the AI version, the word "Originality" was used and for the No-AI version the term "Imagination" was employed. Finally, only the AI video had one response for the category "Level of Innovation", which was the term "Innovative" to describe the creativity of the piece.
Video D3. For the videos D3 the terms were similar between the AI and No-AI videos. For "Aesthetic Quality", words such as "Colorful" and "Dynamic" were employed to describe both videos. The term "Excessively Filled" was also used to describe the AI piece. The No-AI video was described as "Random" and "Disconnected". For the category "Value of The Concept or Idea", the videos were once again described by some similar terms, such as "Urban". Furthermore, the word "Art" was employed for the AI video and the terms "Simple" and "Meaningless" were used to describe the No-AI video. In regard to the category "Communicative Power", the answers strongly differed between both groups. While the AI video was described as transmitting more negative emotional states, such as "Insecurity" and "Anxiety", the No-AI video was described as transmitting more positive emotions, which is seen in the use of words such as "Happy" and "Funny". For the category "Expressiveness", the AI video had no answers, and the No-AI video had the answers "Expressiveness" and "Expressivity". Once again, in the category "Originality of The Composition" the AI video was classified similarly to the No-AI video; both groups included the term "Originality". The AI video was classified as "Different" and "Creative" and the No-AI version as "Singular". Finally, for "Level of Innovation", both videos were coded with the word "Innovative", and the No-AI video was also described through the term "Transformation".
Video D4. Regarding the videos D4, the experimental and control groups evaluated the aesthetic aspects of the pieces similarly. Both versions included the words "Color" and "Rhythm". For the AI version, the terms "Movement" and "Animation" were often used. In the No-AI version, the words "Sound" and "Dynamic" were employed. For the element "Value of The Concept or Idea", a wider range of words was associated with the AI video in comparison with the No-AI video. For the AI video, concepts such as "Harmony" and "Fantasy" were used. For the No-AI video, words such as "Simplicity" and "Abstraction" were used. On "Communicative Power" some differences were observed, as the No-AI version seemed to be perceived as more emotionally impactful; here, words such as "Energetic", "Pleasant", and "Intense" were used to describe the piece. Furthermore, terms such as "Funny", "Captivating", and "Calm" were employed for the AI version. In the category "Expressiveness", merely three responses were given for the No-AI video. Here, the use of everyday objects as elements that contribute to the creation of the song was highlighted, and the words "Expressivity" and "Translation" were included. For the AI video only one answer was noted, the word "Expression". In the category "Originality of The Composition" the words used by both groups were similar. In the No-AI video, the term "Imaginative" was used twice; for the AI video, the words "Different" and "Imagination" were used to describe the piece. Concerning the category "Level of Innovation", no words were used to describe the videos D4.
Video D5. For the videos D5, in the category "Aesthetic Quality" the most used words were "Movement" and "Color" (No-AI) and "Music" and "Visual effects" (AI). For the category "Value of The Concept or Idea" some of the words used were similar in both groups: "Nature" and "Life" (No-AI) and "Nature" and "Free" (AI). For this pair of videos, it is relevant to highlight that in the No-AI version, more varied words related to an abstract description of the video, such as "Symbolic" and "Surreal", were used to describe the composition. Regarding the category "Communicative Power", there were some differences in the evaluation of both videos. In the AI video the answers include serene and strong emotional states, with words such as "Calm" and "Intense". In this AI video, we also see some descriptions of dream-like states, with words such as "Magical" and "Dreamlike". In the No-AI video, words such as "Peace" and "Serenity" that relate to a calm emotional state are also present in the participants' descriptions. In this version of the video, we also see terms that describe a strong emotional state, such as "Intense" and "Shock", but there are no words describing oneiric states. For the "Expressiveness" category the answers to both versions of the video were similar and included the words "Focus" and "Expressiveness". The only two noticeable differences were the presence of the terms "Computerized Language" and "Metaphor" to describe the expression of the AI video. In the category "Originality of the Composition" both videos were described similarly, sharing the word "Different" and other terms that describe the creativity of the pieces, "Picturesque" (AI) and "Imagination" (No-AI). For the category "Level of Innovation" the same happened, with the AI video being described as "Novel", and the No-AI version as "Innovative".
Video D6. The videos D6 differed in the qualitative evaluations of creativity. The word "Color" was often used to describe both videos. The terms "Abstract", "Impressionism", "Textured", and "Effects" were used for the AI video. The term "Image Noise" was also used to describe the AI video. For the No-AI version, the terms "Light", "Effects", and "Empty" were used. Regarding the "Value of The Concept or Idea", the concepts used to describe both videos varied strongly. For the AI video, terms that connect with an altered state of perception such as "Alternative", "Shifted Reality", "Parallel Universe", and "Exploration" were employed. On the other hand, for the No-AI version, the concepts used mostly described materialized elements present in the video, such as "Cat" and "Bicycle", with the exception of the use of the concept "Simple". For the category